State v. Rosado

OPINION OF THE COURT

Dineen A. Riviezzo, J.

Issue Presented

In this proceeding under article 10 of the Mental Hygiene Law, the petitioner, the Attorney General of the State of New York, moves in limine to exclude from the trial evidence any testimony concerning the results of an actuarial risk assessment instrument known as the STATIC-99. The STATIC-99 compares certain historical characteristics of a given sex offender to the characteristics of a group of documented offenders, in an effort to predict the subject offender’s risk of reoffending.

For the reasons which follow, the motion is granted. The court is persuaded that under the unique, bifurcated process of civil commitment which exists in New York State under article 10 of the Mental Hygiene Law, testimony concerning actuarial risk assessment is inappropriate at the trial phase. Moreover, during the course of the hearing held in this case, it became clear that the “norms” for interpreting the scores of the STATIC-99 were significantly altered by Dr. Karl Hanson, one of the developers of the STATIC-99, and his associates at the end of 2008 and the beginning of 2009. (Hanson, Helmus and Thornton, Reporting Static-99 in Light of New Research on Recidivism Norms, available at http://www.static99.org/pdfdocs/forum_article_feb2009.pdf.) Due to the adoption of new norms and test protocols for the STATIC-99 in February 2009, the “general acceptance” of the STATIC-99 in the scientific community for its intended purpose (predicting recidivism rates of sex offenders) has been called into question.

Procedural History

This is a proceeding for civil commitment under Mental Hygiene Law article 10. Respondent has waived a probable cause hearing, and is presently awaiting trial. Prior to the date scheduled for trial, the petitioner Attorney General moved in limine to preclude from the trial phase any evidence of respondent’s score on the STATIC-99, after having been informed of respondent’s intention to introduce the results at trial. Actuarial risk assessment instruments (ARAs), including the STATIC-99 at issue on this motion, are designed to measure a sex offender’s risk of reoffending by compiling a “score” for an individual based on historical data. That score is then equated with a percentage of risk to reoffend as compared to a group of known repeat sex offenders over 5-, 10- and 15-year periods.

Petitioner argues that testimony concerning actuarial testing is not relevant to the issues properly placed before the trier of fact at the trial phase, which in summary is to determine whether respondent has a mental abnormality as that term is defined by article 10. In other words, petitioner argues that the STATIC-99 is not relevant in establishing any of the elements of the definition of mental abnormality. In this regard, the petitioner contends that ARAs are admissible, if at all, only at the second phase of an article 10 proceeding, when a determination must be made by the court as to whether the respondent must be confined for treatment, or granted strict and intensive supervision for treatment in the community (SIST).

Secondly, petitioner argues that the STATIC-99 is not generally accepted by the scientific community for use in determining the existence of a mental abnormality, and thus the use of expert testimony for that purpose should be precluded under Frye v United States (293 F 1013 [DC Cir 1923]).

Respondent argues that precluding testimony concerning the STATIC-99 is fundamentally unfair, since ARAs are customarily relied upon by the psychiatrists and psychologists employed by the New York State Department of Mental Health in screening convicted sex offenders for their eligibility under article 10. Moreover, the results of the STATIC-99 are, respondent alleges, useful to the trier of fact in determining whether or not the respondent has a mental abnormality. The fact that the STATIC-99 is but one clinical tool used in reaching a diagnosis does not, respondent maintains, affect its admissibility, but rather, only affects the weight to be accorded the results of the testing. Respondent observes that the Legislature, in enacting article 10, specifically acknowledged and endorsed the use of actuarial risk instruments, rendering them presumptively admissible. Respondent maintains that no Frye issue was presented, as the use of actuarial risk instruments is not novel or “experimental” in the context of sex offender civil commitment proceedings, is universally endorsed by the scientific community and is in fact used routinely in other states with civil commitment statutes. Lastly, respondent contends that since a court of coordinate jurisdiction has already held that evidence of the STATIC-99 is relevant at trial in an unrelated article 10 proceeding, petitioner is barred by collateral estoppel from maintaining otherwise in this proceeding.

Following written submissions, the court directed a hearing as to all issues raised by petitioner’s application. The court held extensive hearings, at which five experts were called. Petitioner called three experts — Dr. Roger Harris, a psychiatrist licensed in New York and New Jersey with extensive clinical experience in the treatment of sex offenders, who has evaluated between 15 and 20 sex offenders in connection with proceedings under article 10; Dr. Richard Hamill, a clinical psychologist who has been, among other things, the Project Director of the Capital District Region Coalition for Sex Offender Management, and a past president of the New York State Alliance of Sex Offender Service Providers; and Dr. Kostas Katsavdakis, also a clinical psychologist with a specialty in sexual disorders, and an adjunct professor at John Jay College of Criminal Justice in the field of psychology, who has evaluated numerous sex offenders. Respondent’s two experts were, first, Dr. Joe Scroppo, a clinical psychologist in private practice, who is also an attorney, and who has extensive clinical experience in the use of ARAs, and second, Dr. Lawrence Siegel, a forensic psychiatrist with experience in the treatment of sex offenders in New Jersey state prisons, and with assessment of individual offenders under New Jersey’s civil commitment statute for sex offenders.

While the court found that all of the experts were credible witnesses, it did not, as is explained in more detail below, concur with all of their opinions. In addition, the court considered the posthearing written submissions of the parties, and consulted numerous scholarly articles received into evidence on the subject of actuarial testing in the context of civil confinement proceedings.

For the reasons which follow, the court grants the petitioner’s motion in limine.

The Need for a Frye Hearing: Elements and Burden of Proof

In general, the inquiry under Frye is “whether the accepted techniques, when properly performed, generate results accepted as reliable within the scientific community generally.” (People v Wesley, 83 NY2d 417, 422 [1994].) The burden of proving general acceptance in the relevant scientific community rests upon the proponent of the disputed testimony. (See Zito v Zabarsky, 28 AD3d 42 [2d Dept 2006]; People v Kanani, 272 AD2d 186 [1st Dept 2000], lv denied 95 NY2d 935 [2000].) Admissibility under Frye requires a showing that

(1) The expert is competent in the field of expertise which he or she purports to address at trial. This element is not disputed in this case.

(2) The testimony is based on scientific principles or procedures which have been sufficiently established to have gained general acceptance in the particular field involved. In this regard, the hearing court does not determine whether or not a novel scientific theory is reliable, but only whether it is generally accepted in the relevant scientific community. The emphasis is on “counting scientists’ votes.” (Wesley, 83 NY2d at 439 [Kaye, Ch. J., concurring].)

(3) The proffered expert testimony is “beyond the ken” of the jury (see Matott v Ward, 48 NY2d 455 [1979]; People v Cronin, 60 NY2d 430, 433 [1983]). It is not disputed by the parties, and it is evident, that the subject of actuarial testing is beyond the ken of the ordinary person.

(4) And, the testimony is relevant to the issues and facts of the individual case, and more probative than prejudicial. Evidence is relevant if it has any tendency in reason to prove the existence of any material fact, i.e., if it makes determination of the action more probable or less probable than it would be without the evidence. However, even if relevant, the probative value must outweigh the prejudice to the other side. A trial court may exercise its discretion and preclude “technically relevant” evidence “if its probative value is substantially outweighed by the danger that it will unfairly prejudice the other side or mislead the jury.” (People v Scarola, 71 NY2d 769, 777 [1988].)

In engaging in a Frye analysis, the court may consider scholarly articles on the subject matter for the purpose of understanding “general acceptance.” Petitioner indeed submitted numerous writings and journal articles on the subject of actuarial testing of sex offenders. Respondent objected, however, and in fact articles initially given to petitioner by respondent were received by the court with preservation of respondent’s objection for the record. Because Frye is concerned with “head counting” of experts, the state of knowledge in the profession is at issue, and scholarly articles and journals are therefore admissible as reflecting those matters which are generally accepted in the relevant scientific community. (See e.g. People v Wernick, 215 AD2d 50, 52 [2d Dept 1995], affd 89 NY2d 111 [1996]; Fraser v 301-52 Townhouse Corp., 57 AD3d 416 [1st Dept 2008] [plaintiffs placed in evidence nearly 40 articles, treatises and other published studies concerning the relationship between building dampness and mold and sickness in humans; defendants placed approximately 15 such publications in evidence].)

At the outset, an issue is raised by respondent as to whether the present dispute requires a Frye inquiry at all. Respondent argues, and it is not contested, that ARAs do not involve novel science, but are in fact generally accepted in the scientific community in the context of sex offender civil commitment proceedings. As stated in Morosco, The Prosecution and Defense of Sex Crimes § 41.09 (2) (Matthew Bender & Company, Inc., Lexis-Nexis Group),

“Courts have split on whether a Frye hearing should even be held on the use of actuarial risk assessment tools. However, there is apparent unanimity . . . [that] actuarial instruments are admissible in SVP [sexually violent predator] proceedings, regardless of whether Frye applies as a threshold factor, because they are ‘generally accepted.’ As of 2004, experts in 20 states were allowed to rely on such tools. The same is true when the Daubert test for expert testimony is applied. Finally, many states have mandated the creation of actuarial risk tables for their own SVP proceedings, or the use of specified actuarial instruments.”

Petitioner, as previously noted, does not contest that ARAs, in general, are based on scientifically accepted principles. But petitioner does argue that New York’s statutory scheme for the civil management of sex offenders is unique among the states which have adopted similar laws in that in New York State the issues are bifurcated, with separate determinations as to the existence of a mental abnormality and the determination of the form of treatment. In New York, the issue of mental abnormality is tried by a jury (or the court if a jury is waived), and only after the existence of a mental abnormality is established does the court determine the manner of treatment, whether in confinement or supervision in the community. Petitioner argues that ARAs, which only predict the risk of reoffending, are relevant only at the second phase. Thus, the fact that other states may admit such testimony, or that other states have applied a Frye analysis, is not controlling, petitioner argues, since in New York State the issues at the initial trial phase do not involve the risk of reoffending. Petitioner maintains that the scientific acceptance of ARAs is limited to their use in measuring the risk of reoffending only, and not for the determination of mental abnormality.

This court is aware that in other article 10 proceedings, testimony concerning ARAs has been admitted both at the trial and in the dispositional phases of the proceedings. Other courts have excluded the testimony from trial. However, in none of these cases has a Frye hearing been conducted. Initially, when the statute was first enacted in April 2007, the Attorney General’s Office took the position that ARAs were admissible at trial; however, they now assert that their uniform statewide position is that ARAs are inadmissible regardless of the score. Although clearly each respondent is entitled to raise his or her own arguments, the Mental Hygiene Legal Service, designated in article 10 as the primary attorney for indigent respondents, has taken conflicting positions on this issue in different judicial departments. (See e.g. State of New York v Fox, Sup Ct, Wayne County, index No. Conf-1346 [respondent moves to exclude testimony concerning actuarial risk assessment instruments including the STATIC-99 at trial].)

Frye is concerned with the introduction of novel scientific evidence. There is no question that ARAs are scientifically accepted as a means of predicting recidivism, i.e., an individual’s risk or danger of reoffending, and for that purpose it is not “novel.” However, the use of actuarial testing to determine the existence of a mental abnormality is indeed novel, and thus raises Frye concerns.

The present case is akin to Styles v General Motors Corp. (20 AD3d 338 [1st Dept 2005]), a negligence case in which plaintiffs sought to introduce expert testimony as to the results of “crash tests” performed on a motor vehicle, in order to establish that the vehicle was not properly designed. The experts employed two accepted crash tests in tandem to test the safety of a car. The court held that

“[A]lthough each phase of plaintiffs’ test is, separately viewed, a widely accepted technique, plaintiffs failed to demonstrate that the use of both tests, in combination, on the same vehicle, has gained general acceptance within the pertinent scientific community . . . Translating a roll-over accident into angles of pitch and roll, and dropping and pressurizing, entails scientific matters not within the knowledge of the ordinary juror, and therefore must be demonstrated to be sufficiently established to have gained general acceptance within the scientific community.” (Id. at 340.)

Here, similarly, the novel use of accepted actuarial testing must be gauged under Frye. Moreover, while other states have determined that ARAs pass muster under a Frye analysis in civil commitment proceedings, those inquiries were limited to determining whether the use of actuarial testing was scientifically accepted in predicting risk. (See e.g. Collier v State, 857 So 2d 943 [Fla Dist Ct App 2003] [ARAs used to assess offender are subject to Frye analysis]; In re Commitment of Simons, 213 Ill 2d 523, 821 NE2d 1184 [2004] [expert opinion testimony regarding propensity to commit acts of sexual violence in the future which is based in part on use of the risk assessment instruments satisfies the Frye test].)

In light of the split in the trial courts here in New York State, the unique nature of New York State’s bifurcated statute, and the divergent positions of the parties, a Frye hearing was necessary to understand the limits and proper application of these ARAs, specifically the STATIC-99.

In addition, as will be explained in greater detail below, it became clear during the course of the hearing that new norms promulgated in February 2009 by the creators of the STATIC-99 have raised issues concerning the meaning to be ascribed to the test, and the manner in which the results are to be evaluated. These new developments require that the new norms of the STATIC-99 be examined under Frye as to its continuing acceptance in the scientific community.

Understanding the STATIC-99

All five experts generally agreed with the reasons why the STATIC-99 was created, its general use and its limitations. As some in-depth knowledge of the working of the STATIC-99 is required in the present circumstances, the court summarizes the portions of the experts’ collective testimony, as well as the journal articles, that are relevant to these proceedings.

Since the mid-1990s, an increasing number of jurisdictions have instituted sexually violent predator (SVP) civil commitment statutes. These laws seek to identify, and treat, a select group of incarcerated sex offenders whom the state determines, through the use of psychological and psychiatric expert testimony, have some type of mental disorder which results in those individuals reoffending. While the specific language employed in these statutes differs, they all include a showing that the convicted sex offender is likely to commit sex offenses in the future. ARAs were developed in connection with sex offender proceedings in order to assist psychologists and psychiatrists in predicting the risk of an individual’s recidivism. The phrase “risk of recidivism” can also be expressed as a “likelihood to commit sex offenses in the future,” or a person’s “level of dangerousness.” (For a history of SVP laws and ARAs, see Sreenivasan, Weinberger and Garrick, Expert Testimony in Sexually Violent Predator Commitments: Conceptualizing Legal Standards of “Mental Disorder” and “Likely to Reoffend,” 31 J Am Acad Psychiatry & L [No. 4] 471 [2003].)

The use of ARAs in connection with SVP proceedings emerged to counter what was believed to be the lack of accuracy in clinical judgments by mental health professionals. Clinical judgment involves using experience and training, along with an interview or evaluation of the client, to reach an opinion about risk. (Testimony of Dr. Siegel, transcript at 576.) By studying data concerning the traits or characteristics of known sex offenders who were released from prison and subsequently rearrested or reconvicted, certain traits or characteristics (risk factors) could be empirically demonstrated as associated with sexual recidivism, such that persons who shared those traits or characteristics could be identified as “high risk” to reoffend. Conversely, those who did not share those traits or characteristics could be classified as “low risk” to reoffend. Since ARAs are based solely on historical and empirical data, the results would not be prone to subjective error in the way that clinical judgment might be.

The STATIC-99 is a 10-item ARA created by R. Karl Hanson, Ph.D. and David Thornton, Ph.D. (See http://www.static99.org.) The ARA is called “static” because it looks only at static, i.e., unchanging, historical events. Like other ARAs, it compares the characteristics and history of a given person to the characteristics of a group. In the case of the STATIC-99, research was done based on a group of sex offenders who had been reconvicted, and certain characteristics were identified which correlated with recidivism. A test was then developed by which the individual sex offender’s characteristics could be compared with the group characteristics, to determine the extent to which the individual had characteristics associated with reoffending.

The “risk items” or “characteristics” which are contained in the STATIC-99 consist of demographic information (age of sex offender and living with a partner for more than two years), criminal history (charges and convictions for index and other sex and violent offenses), and victim questions (whether any victims were unrelated to the offender, “strangers,” or male).

According to a set of Coding Rules established by the makers of the ARA, a numerical score is compiled by assigning a numerical value to each risk item. The 2003 version of the STATIC-99 (available at http://www.static99.org/pdfdocs/static99-coding-rules_e71.pdf [visited May 28, 2009]) sets forth the following form for “coding,” followed by the ranking of the numerical scores into categories of risk of reconviction:

[Image: STATIC-99 coding form and risk categories]

For example, in category 1 (“Young”), a point is given if the offender’s age is between 18 and 25 at the time of release to the community. Category 3 (“Index Non-sexual violence”) requires a point to be assessed if at the time of the conviction for the underlying or “index” sex offense, the offender was also convicted of a violent nonsexual offense such as robbery. In category 5, the evaluator counts the number of convictions and the number of charges and uses whichever number is higher to assign a point value, which can range from 0 to 3. Category 7 (“Non-contact Sex Offenses”) includes exhibitionism, child pornography and voyeurism. In category 9 (“Any Stranger Victims”), a point is added if respondent knew the victim for less than 24 hours, and thus is considered to be a stranger. (Testimony of Dr. Katsavdakis, transcript at 260-268.)

The total point score equates with a range of risk of recidivism, with 1 point being low risk; 2 or 3 points being moderate risk; 4 or 5 points being medium-high risk; and 6 points or more being high risk. The authors of the STATIC-99 report in the Coding Rules manual that although it is possible to score more than 6 points, there is no significant increase in recidivism rates for scores between 6 and 12. The STATIC-99 then sets forth a probability table by which the raw score can be related to a percentage risk of recidivism over 5-, 10- and 15-year periods. These tables accord with the general understanding by experts that rates of recidivism generally increase over time. For example, for the norms that existed in August 2008, a score of “4” corresponded with a percentage of recidivism of 26% at 5 years, 31% at 10 years, and 36% at 15 years. (STATIC-99 Coding Rules [rev 2003], respondent’s exhibit B.)
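The scoring scheme just described can be restated as a short sketch. It is illustrative only and is not part of the record or of the Coding Rules: the names `risk_category`, `GROUP_RECONVICTION_RATES` and `group_rate` are hypothetical, the treatment of a score of 0 as “low” is an assumption (the text above mentions only 1 point), and the table holds only the single row quoted above for a score of 4 under the norms in effect in August 2008.

```python
from typing import Optional


def risk_category(score: int) -> str:
    """Map a raw STATIC-99 score to the risk band described in the text."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= 1:
        # Assumption: a score of 0 is grouped with 1 as "low".
        return "low"
    if score <= 3:
        return "moderate"
    if score <= 5:
        return "medium-high"
    return "high"


# Only the row quoted in the text (score of 4, August 2008 norms).
# Keys: raw score, then follow-up period in years.
GROUP_RECONVICTION_RATES = {4: {5: 0.26, 10: 0.31, 15: 0.36}}


def group_rate(score: int, years: int) -> Optional[float]:
    """Return the group reconviction rate, or None if the text gives no figure."""
    row = GROUP_RECONVICTION_RATES.get(score)
    if row is None:
        return None
    return row.get(years)


if __name__ == "__main__":
    print(risk_category(4), group_rate(4, 15))
```

The lookup returns a rate for the comparison group, not a probability for any one person, and it returns nothing for scores the text does not quote.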

The STATIC-99, as it was first created, measured recidivism as a rate of reconviction, since it was based on data from a known population of reconvicted sex offenders. Thus, if anything, it grossly understated the risk of reoffending, since it is commonly understood that more crimes are committed than are reported, and more arrests are made than convictions obtained. (Rennison, Rape and Sexual Assault: Reporting to the Police and Medical Attention, 1992-2000, US Dept of Justice, Bur of Statistics [Aug. 2002 NCJ 194530]; testimony of Dr. Siegel, transcript at 613.)

As was made clear in the hearing testimony, the STATIC-99 has no predictive value for an individual. The STATIC-99 does not and cannot measure an individual’s risk of reoffending. Rather, it ranks an individual with a group sharing certain characteristics. In other words, the STATIC-99 score only indicates that a respondent has characteristics which correlate with a group of individuals whose rate of recidivism is “x” percent. The offender’s risk may be higher or lower than the probabilities estimated in the STATIC-99, depending on other risk factors not measured by the STATIC-99. As Dr. Siegel testified, ARAs place people in a risk group, but the individual may or may not have the same outcome as that group, and the margin of error increases as to the individual as compared to the group. (Transcript at 666; see also Vincent, Maney and Hart, The Use of Actuarial Risk Assessment Instruments in Sex Offenders, ch 6, at 84 [“They (the courts) should understand that it is impossible to make accurate estimates about the probability of an individual reoffending using these tests”].) What this means for the finder of fact is that a clinician cannot testify that this respondent’s rate of recidivism is “x” or “y” percent, and can only opine that respondent shares characteristics of a group of people found to recidivate at those percentages. To take the example given above then, a person who scored a “4” in August 2008 shared characteristics of a group of persons who reoffended at a rate of 36% over 15 years. However, any particular individual scoring a “4” might fall within the group of 36% who reoffend, or within the group of 64% who do not reoffend.

Dr. Hamill described the STATIC-99 as a “predictive” instrument, as opposed to a “descriptive” instrument, meaning that it cannot describe the person’s psychology, emotions, volitional capacities or personality dynamics. (Transcript at 149.) “Actuarial instruments do not measure psychological constructs such as personality or intelligence. In fact, they do not measure any personal attributes of the particular sex offender at all. Rather, they are simply actuarial tables — methods of organizing and interpreting a collection of historical data.” (In re Commitment of R.S., 339 NJ Super 507, 540, 773 A2d 72, 92 [2001], affd, 173 NJ 134, 801 A2d 219 [2002] [ARAs are admissible in evidence in a civil commitment proceeding under the New Jersey Sexually Violent Predator Act (NJ Stat Ann §§ 30:4-27.24 to 30:4-27.38), when such tools are used in the formation of the basis for a testifying expert’s opinion concerning the future dangerousness of a sex offender].) The term “psychological constructs” refers to how a person relates to other persons, how they deal with their emotions, remorse, or behavior. (Testimony of Dr. Katsavdakis, transcript at 297.)

For the purposes of our present inquiry, what this means is that the STATIC-99 does not distinguish between, nor can it explain, the reasons why a person might reoffend. For the population of repeat sex offenders who formed the control group, the STATIC-99 does not differentiate between those persons who intended to commit the crime, were motivated by passion, revenge or rage, came upon a crime of opportunity, or were compelled to offend due to an inability or serious difficulty in controlling their behavior due to a disease, defect or condition. Put more simply, the STATIC-99 does not distinguish those who are at risk of reoffending due to a lack of ability to control their behavior (called “volitional impairment”) from those who are at risk of reoffending due to choice or opportunity. This is an essential distinction, as will be discussed in more detail later, as the United States Supreme Court has held that only those persons who have serious difficulty in controlling their behavior due to a disease or defect can be the subject of civil confinement statutes. Those individuals who choose to reoffend, those considered to have “criminal tendencies,” are not subject to civil confinement statutes. (See Kansas v Hendricks, 521 US 346 [1997].)

The hearing testimony revealed that there are other drawbacks and inadequacies in the STATIC-99. It is only moderately accurate for the use intended. Indeed, the STATIC-99 Coding Rules (http://www.static99.org/pdfdocs/static-99-codingrules_e.pdf, at 3), promulgated by Dr. Hanson and others, forthrightly indicate that “[t]he weaknesses of the STATIC-99 are that it demonstrates only moderate predictive accuracy (ROC = .71) and that it does not include all the factors that might be included in a wide-ranging risk assessment (Doren, 2002).” Dr. Siegel defined “moderate predictive validity” as values ranging between .6 and .7, with 1 being “perfect” and .5 being “useless.” (See also Campbell, Sex Offenders and Actuarial Risk Assessments: Ethical Considerations, 21 Behav Sci & L [No. 2] 269 [2003].) One law journal opined that in predicting whether an individual is more likely than not to recidivate consistent with the group’s percentage rate of recidivism, “the STATIC-99 cannot do much better than a coin flip.” (Berlin, Galbreath, Geary and McGlone, The Use of Actuarials at Civil Commitment Hearings to Predict the Likelihood of Future Sexual Violence, 15 Sexual Abuse: J Res & Treatment [No. 4] 377, 381 [Oct. 2003], petitioner’s exhibit 4, tab A.)
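The scale Dr. Siegel described, running from .5 (“useless”) to 1 (“perfect”), can be illustrated with a brief sketch. It assumes that the “ROC” figure quoted from the Coding Rules refers to the area under the ROC curve, which is commonly read as the probability that a randomly chosen recidivist has a higher score than a randomly chosen non-recidivist. The function name `auc` and the sample scores are hypothetical; they are not drawn from the record or from any STATIC-99 data.

```python
from typing import Sequence


def auc(recidivist_scores: Sequence[float],
        non_recidivist_scores: Sequence[float]) -> float:
    """Share of (recidivist, non-recidivist) pairs in which the recidivist
    scores higher, counting ties as half a pair."""
    if not recidivist_scores or not non_recidivist_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for r in recidivist_scores:
        for n in non_recidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(non_recidivist_scores))


if __name__ == "__main__":
    # Hypothetical scores chosen only to show the two ends of the scale.
    print(auc([5, 6], [1, 2]))  # every recidivist outscores every non-recidivist
    print(auc([3, 3], [3, 3]))  # scores carry no information about outcome
```

On this reading, a value of 1 means the scores separate the two groups completely, and a value of .5 means the scores do no better than chance at ranking one above the other.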

Moreover, “static” tests based exclusively on historical factors do not take into account dynamic (changing) factors in determining risk. (Miller, Amenta and Conroy, Sexually Violent Predators: Empirical Evidence, Strategies for Professionals, and Research Directions, 29 L & Hum Behav [No. 1] 29 [Feb. 2005].) The goal of actuarial risk assessment is to achieve objective criteria for measuring risk, but there exists no accepted mechanism to take into account individual and dynamic factors before reaching a conclusion. For example, as Dr. Siegel testified, Jeffrey Dahmer, who was convicted of various sexual offenses and in fact consumed the body parts of his victims, would score only a 2 on the STATIC-99 (low risk), because the more deviant aspects of his crimes are not “risk factors” listed in the STATIC-99, and thus are not reflected by his score. (Transcript at 580.) For those individuals who were in long-term relationships and therefore did not earn an extra point in that category, there is no accounting for a relationship that might have been violent or abusive. (Testimony of Dr. Katsavdakis, transcript at 293.) On the other hand, an offender who scored “high risk,” but who suffered a stroke and is now paralyzed, would not present a heightened risk of reoffending. Other examples include “protective factors” that might decrease risk, such as the completion of sex offender treatment programs, advancing age, or the offender’s use of drug therapy to reduce sexual appetite. (Testimony of Dr. Siegel, transcript at 582, 671.) Currently, there is no uniformly recommended methodology for altering an ARA score upward or downward to account for these factors. Consequently, “if the score from an [ARA] is simply incorporated into a clinical judgment, absent any systematic, transparent procedure for doing so that is recommended by the authors of the scale, we run the risk of nullifying the advantage of objectivity by the use of the scale.” (Prentky, Janus, Barbaree, and Schwartz, Sexually Violent Predators in the Courtroom: Science on Trial, 12 Psychol, Pub Pol’y & L [No. 4] 357, 384 [2006].)

Nor does the STATIC-99 delineate recidivism rates by the type of offense committed. No mechanism exists to distinguish pedophiles and other persons with recognized deviancies from other sex offenders. For example, incest offenders recidivate at a significantly lower rate than offenders who target victims outside of the family. Child molesters who target male victims recidivate at a significantly higher rate than those targeting only female victims. (Harris and Hanson, Sex Offender Recidivism: A Simple Question, available at http://www.publicsafety.gc.ca/res/cor/rep/2004-03-se-off-eng.aspx, petitioner’s exhibit 10.) As explained in one scholarly journal,

“The current research of actuarial measures is highly reductionist, in collapsing most sex offenders into a single category. This profound disregard for the heterogeneity of sexual offenders may lead to serious errors in prediction. Even the most basic typologies (e.g., rapists and child molesters) are neglected. For example, child molesters are often motivated by sexual aspects of offending ... In contrast, rapists are often motivated by anger and commit nonsexual offenses. Lumping together all paraphilias and sex offenses confounds any attempt at meaningful interpretation. Unquestionably, more focused methods are needed that take into account both clinical conditions (e.g., paraphilias) and offense types.” (Rogers and Jackson, Sexually Violent Predators: The Risky Enterprise of Risk Assessment, 33 J Am Acad Psychiatry L 523, 526-527 [2008].)

New York State’s Bifurcated Trial Process for Civil Confinement of Sex Offenders

New York State appears to be unique among states with SVP statutes, in that it has established a bifurcated process for the adjudication of SVPs in need of treatment. An understanding and appreciation of this bifurcated process is essential in determining the way that scientific evidence may be employed under Mental Hygiene Law article 10.

Article 10 divides the process of adjudicating SVPs in need of treatment into two distinct phases. The first phase, requiring a trial, is concerned with establishing whether or not the respondent is a detained sex offender who suffers from a mental abnormality. (See Mental Hygiene Law § 10.07 [d].) Only after the trier of fact — a jury of 12, or the court if a jury trial is waived — has determined that the respondent suffers from a mental abnormality is the form of treatment to be considered by the court in the second, dispositional phase.

Mental abnormality is defined under Mental Hygiene Law § 10.03 (i) as “a congenital or acquired condition, disease or disorder that affects the emotional, cognitive, or volitional capacity of a person in a manner that predisposes him or her to the commission of conduct constituting a sex offense and that results in that person having serious difficulty in controlling such conduct.” Unfortunately, Mental Hygiene Law article 10 does not further define the terms employed in the definition of “mental abnormality.”

As the sometimes subtle meaning of these terms is crucial to the determination of the motion, and as the existing law and ex*395perience in other states, as well as developments in the field of psychology, inform an understanding of how to apply these terms, it is necessary to further refine our understanding of these concepts.

The definition of “mental abnormality” under New York State law encompasses what may be viewed as two separate components or elements — predisposition and volition. These two elements must be established during the trial phase. Specifically, the respondent’s alleged condition, disease or disorder must: (1) “predispose[ ] him or her to the commission of conduct constituting a sex offense,” and (2) result “in that person having serious difficulty in controlling such conduct.” (Mental Hygiene Law § 10.03 [i].)

With respect to the element of “predisposition,” Dr. Hamill defined a “predisposition” as an “inclination to engage in a behavior.” (Transcript at 157.) Dr. Harris described it as “enduring, doesn’t go away, like an alcoholic.” (Transcript at 66-67.) Dr. Siegel explained that pedophiles and paraphiliacs in particular are thought to be “predisposed” to commit sex offenses because they have urges to behave in a particular manner that is sexually arousing. Those afflicted with these conditions are aroused or find it gratifying to act in a particular manner, such as committing sex offenses against children, and they continue to act that way since “those desires remain present to one degree or another basically indefinitely.” (Transcript at 588.)

The second element, “serious difficulty in controlling conduct,” is called “volitional impairment,” and is at the crux of the present controversy. Volitional impairment was described by Dr. Hamill as a “difficulty in conducting one’s self according to one’s wishes or desires” or “whether one has that willful control over one’s behavior.” (Transcript at 164.) Dr. Harris described it as “someone’s ability to choose to decide to do something.” (Transcript at 109.) Dr. Katsavdakis testified that it means “the inability to control one’s impulses when one . . . [is] . . . aroused or not being able to stop one’s self from acting.” (Transcript at 223.) Dr. Scroppo defined this phrase in part as “in the face of logical or obvious rational reasons not to do something ... a rational, normal, average person . . . would stop and someone who has difficulty controlling would continue forward.” (Transcript at 474.)

The two concepts of predisposition and volition are separate and distinct, like “apples and oranges.” (Testimony of Dr. Hamill, transcript at 165.) A disorder, like pedophilia, might *396predispose someone to the commission of sex offenses, but the offender might have a great degree of control over the predisposition.

That part of the definition of “mental abnormality” which requires that the condition, disease or disorder “results in that person having serious difficulty in controlling such conduct” must be understood in light of the Supreme Court’s decisions on the subject, including Kansas v Hendricks (521 US 346 [1997]). In Hendricks, the Court noted the importance of the “volitional element” in civil commitment proceedings:

“A finding of dangerousness, standing alone, is ordinarily not a sufficient ground upon which to justify indefinite involuntary commitment. We have sustained civil commitment statutes when they have coupled proof of dangerousness with the proof of some additional factor, such as a ‘mental illness’ or ‘mental abnormality.’ These added statutory requirements serve to limit involuntary civil confinement to those who suffer from a volitional impairment rendering them dangerous beyond their control. The Kansas Act is plainly of a kind with these other civil commitment statutes: It requires a finding of future dangerousness, and then links that finding to the existence of a ‘mental abnormality’ or ‘personality disorder’ that makes it difficult, if not impossible, for the person to control his dangerous behavior. Kan. Stat. Ann. § 59-29a02 (b) (1994). The precommitment requirement of a ‘mental abnormality’ or ‘personality disorder’ is consistent with the requirements of these other statutes that we have upheld in that it narrows the class of persons eligible for confinement to those who are unable to control their dangerousness.” (Id. at 358 [emphasis added; citations omitted].)

Thus, for civil confinement, the Court required that the mental condition cause a volitional impairment.

The Court revisited the constitutionality of the Kansas statute in Kansas v Crane (534 US 407 [2002]). The Kansas Supreme Court interpreted the United States Supreme Court’s earlier majority opinion in Hendricks as requiring a finding that the defendant had a complete inability to control his dangerous behavior. In rejecting the requirement of a total inability to control behavior, the Crane court stated:

“Hendricks underscored the constitutional impor*397tance of distinguishing a dangerous sexual offender subject to civil commitment ‘from other dangerous persons who are perhaps more properly dealt with exclusively through criminal proceedings.’ 521 U.S., at 360. That distinction is necessary lest ‘civil commitment’ become a ‘mechanism for retribution or general deterrence’ — functions properly those of criminal law, not civil commitment. Id., at 372-373 (KENNEDY, J., concurring) . . . The presence of what the ‘psychiatric profession itself classifie[d] ... as a serious mental disorder’ helped to make that distinction in Hendricks. And a critical distinguishing feature of that ‘serious . . . disorder’ there consisted of a special and serious lack of ability to control behavior. “In recognizing that fact, we did not give to the phrase ‘lack of control’ a particularly narrow or technical meaning. And we recognize that in cases where lack of control is at issue, ‘inability to control behavior’ will not be demonstrable with mathematical precision. It is enough to say that there must be proof of serious difficulty in controlling behavior. And this, when viewed in light of such features of the case as the nature of the psychiatric diagnosis, and the severity of the mental abnormality itself, must be sufficient to distinguish the dangerous sexual offender whose serious mental illness, abnormality, or disorder subjects him to civil commitment from the dangerous but typical recidivist convicted in an ordinary criminal case.” (Id. at 412-413 [citation omitted].)

It is clear, then, that the volitional element serves to distinguish those sex offenders who have a mental “disease” from the “dangerous but typical recidivist convicted in an ordinary criminal case” who reoffends due to choice — whether it be a crime of opportunity or other intentional act. A person who does not have a disease or disorder that results in his having a serious difficulty in controlling his or her behavior cannot be civilly confined.

At the post-trial dispositional phase pursuant to Mental Hygiene Law § 10.07 (f), the issue to be determined by the court is the manner of treatment, whether in confinement or under strict and intensive supervision for treatment in the community. The court must determine whether the respondent is either:

(1) a sex offender requiring strict and intensive supervision which is defined as “a detained sex offender who suffers from a *398mental abnormality but is not a dangerous sex offender requiring confinement” (Mental Hygiene Law § 10.03 [r]), or

(2) a dangerous sex offender requiring confinement, meaning “a person who is a detained sex offender suffering from a mental abnormality involving such a strong predisposition to commit sex offenses, and such an inability to control behavior, that the person is likely to be a danger to others and to commit sex offenses if not confined to a secure treatment facility.” (Mental Hygiene Law § 10.03 [e].)

There is no dispute in this proceeding that once the factfinder has determined that a mental abnormality exists, respondent is, at a minimum, subject to SIST since a “sex offender requiring strict and intensive supervision” is defined only as a sex offender with a mental abnormality who is not dangerous. The issue for the court is whether respondent’s predisposition and volitional impairment are so strong that he is “likely to be a danger to others and to commit sex offenses if not confined to a secure treatment facility.” This phrase refers to the degree of respondent’s dangerousness, or his risk of reoffending as discussed above in the context of the STATIC-99. Only upon a finding of both dangerousness and a likelihood of committing future offenses (i.e., risk of reoffending) is confinement for treatment required. Conversely, the sole finding that the respondent has a mental abnormality requires that SIST be imposed without any additional finding of dangerousness or risk of reoffending.

Thus, under New York’s statutory scheme, neither the risk of reoffending nor dangerousness is part of the definition of mental abnormality, and neither need be established at trial. The following chart summarizes the foregoing concepts:

[[Image here]]

*399As was testified by the experts at the hearing, New York State is the only state that allows the court the option of either confinement or supervision once a respondent is found to have a mental abnormality. In Kansas, for example, under the statute at issue in the Supreme Court opinions cited above, a finding of mental abnormality results only in confinement, and the definition of mental abnormality logically incorporates the risk/dangerousness element. According to Dr. Katsavdakis, almost all of the states with SVP acts require secure confinement, while only a few, such as Texas, permit outpatient treatment. (Transcript at 221.) Only New York provides for the alternatives of either confinement or outpatient supervision.

In examining the legislative history, the court notes that the initial version of the law, as set forth in 2006 NY Assembly Bill A9282, included, as part of the definition of mental abnormality, a requirement that it be “likely that [the respondent] will commit a felony sex offense in the future.” The deletion of this part of the definition supports the petitioner’s arguments that the risk or likelihood of reoffending is not germane to the issue of mental abnormality, and is thus not relevant at the trial phase of an article 10 proceeding.

Further support for the argument that the risk of reoffending was meant to be separated from the finding of mental abnormality was presented by the testimony of Dr. Hamill, who testified that in early 2007 he consulted with former Governor Spitzer’s counsel on the creation of the Sex Offender Management and Treatment Act, in his capacity, inter alia, as president of the Association for the Treatment of Sexual Abusers. Dr. Hamill stated that in examining the commitment laws of 17 states and the District of Columbia, juries were perceived as inclined to find that a respondent had a mental abnormality where there was testimony that the respondent was sufficiently dangerous and highly likely to reoffend, and thus in almost 100% of the cases, a respondent would ultimately be confined. For that reason, “it made sense” to take the issue of dangerousness (i.e., risk of reoffending) away from the jury, and leave that issue — as well as the issue of treatment — to the court to resolve. (Transcript at 174.)

The STATIC-99 is Not Admissible at the Trial Phase of an Article 10 Proceeding

All of the experts who testified agreed that the STATIC-99 is generally accepted in the scientific community to predict rates *400of recidivism. In fact, the methodology employed by the STATIC-99 has been found to be admissible in evidence in civil commitment proceedings in a number of states for that purpose. (See e.g. In re Commitment of Simons, 213 Ill 2d 523, 536, 821 NE2d 1184, 1192 [2004]; In re Detention of Thorell, 149 Wash 2d 724, 757, 72 P3d 708, 726 [2003]; In re Commitment of Tainter, 259 Wis 2d 387, 655 NW2d 538, 2002 WI App 296 [2002], review denied 259 Wis 2d 101, 657 NW2d 707, 2003 WI 16 [2003]; In re Commitment of R.S., supra, 339 NJ Super 507, 513, 773 A2d 72, 75 [2001], affd 173 NJ 134, 801 A2d 219 [2002] [per curiam]; see Eric S. Janus and Robert A. Prentky, Forensic Use of Actuarial Risk Assessment with Sex Offenders: Accuracy, Admissibility and Accountability, 40 Am Crim L Rev 1443, 1472-1475 [2003].) The experts at the hearing also agreed that the results of the STATIC-99 are relevant at the dispositional phase, where the court must gauge dangerousness and the respondent’s risk of reoffending in order to make a choice between confined treatment or SIST.

Each of the five experts who testified at the hearing, including respondent’s experts, also agreed that ARAs, including the STATIC-99, do not diagnose a condition, disease or defect. The STATIC-99 cannot, for example, diagnose paraphilia, pedophilia or antisocial personality disorder, three common diagnoses in article 10 proceedings.

They also unanimously agreed that ARAs including the STATIC-99 cannot predict whether a particular condition “predisposes” a person to the commission of sex offenses. Dr. Katsavdakis stated that the STATIC-99 score isn’t relevant to predisposition. He indicated that some of the risk factors in the STATIC-99 (i.e., those relating to past sexual conduct) identify subject areas relevant to determining predisposition, but he observed that an actuarial score does not inform the expert as to whether the predisposition emanates from the disease itself. (Transcript at 231-232.)

Dr. Siegel said that the STATIC-99 might have something to say about how strong a predisposition is. (Transcript at 588-589.) But he agreed with the respondent’s other expert, Dr. Scroppo, that predisposition is particular to an individual and cannot be measured by the STATIC-99. (Testimony of Dr. Scroppo, transcript at 490.) As Dr. Hamill said, ARAs cannot quantify a predisposition. (Transcript at 163.)

The primary conceptual difficulty in the present context is whether the STATIC-99 has any validity in assessing volitional *401impairment, or the element of “serious difficulty in controlling behavior,” which is a component of the definition of “mental abnormality” under the statute. Most significantly, the respondent’s two experts agreed with the petitioner’s three experts that the STATIC-99 does not assess volitional impairment. As discussed in the section above, it cannot tell you whether a specific individual is volitionally impaired. (Testimony of Dr. Scroppo, transcript at 426; testimony of Dr. Siegel, transcript at 683.) The court finds that respondent is, in essence, advocating the use of the STATIC-99 as a screening tool in a manner inconsistent with the plain language of the statute, and has expressed general policy concerns contrary to the legislative intent underlying article 10.

Dr. Scroppo testified that his methodology in evaluating article 10 respondents consists of first finding whether an individual has a condition, disease or defect and then looking to see if the mental disorder predisposes the individual to commit sex offenses. (Transcript at 397.) He then looks at the STATIC-99 score and if it is low — which he understands to mean that the person is not likely to commit sex offenses — then he goes no further because he believes there is no need to assess volitional impairment since there is no likelihood that the individual is going to recommit a sex offense. On cross-examination, he expressed his belief that article 10 was only intended to be applied to people who are likely to commit another sex offense as a result of a mental abnormality. (Transcript at 433.) Only if the STATIC-99 score is high, will he make an independent determination as to whether the individual’s risk of reoffending results from a volitional impairment or for some other reason. While acknowledging that the STATIC-99 cannot assess volitional impairment (transcript at 426), Dr. Scroppo still asserted a “common sense” belief (transcript at 465) that a low STATIC-99 score “by definition” means that an individual is not having serious difficulty in controlling behavior, whereas a high STATIC-99 does not necessarily correlate with serious difficulty in controlling behavior. (Transcript at 519.) A low score, he hypothesized, suggests “some control . . . it’s rough, but it’s at least one indicator, one approximation.” (Transcript at 429.)

It does comport with common sense to suggest that a person who has serious difficulty in controlling his behavior is likely to reoffend. In this sense, there is a “correlation” or an “association” between these two concepts when expressed in this way. But here, too, not everyone who has serious difficulty in con*402trolling behavior will reoffend. Moreover, the correlation does not exist in the converse. A person who is likely to reoffend may or may not have a volitional impairment, i.e., “serious difficulty in controlling behavior.” A person may reoffend simply because that person chose to do so, or took advantage of an opportunity to offend, and not from any lack of volitional capacity.

As Dr. Katsavdakis explained, the STATIC-99 does not impart any psychological or clinical information as to why a person is reoffending, and does not identify the causal reasons why someone is likely to reoffend. (Transcript at 227.) Consequently, a “high” score indicates a high risk of reoffending, but does not mean that an individual with a high score has more difficulty in controlling his behavior, any more than a low score means a person has no serious difficulty in controlling his behavior. All three of petitioner’s experts testified that using the STATIC-99 for this purpose is not supported by the developers of the STATIC-99 and/or the literature, and as such Dr. Katsavdakis suggested that there are, in his mind, ethical implications to using the tool in this manner. (Transcript at 252.) Indeed, Dr. Scroppo conceded that he would need “to look at the literature” to see if the STATIC-99 could be employed in the manner in which he was using it (transcript at 465); and he was unaware whether Dr. Hanson, the creator of the STATIC-99, endorsed this use of the STATIC-99. (Transcript at 513-514.) Dr. Siegel, respondent’s second expert, flatly disagreed with Dr. Scroppo’s assertion that a person with a low STATIC-99 score “necessarily” has no “serious difficulty in controlling behavior,” and stated that if an expert said that, “I think that he was wrong. You need to individualize it” — meaning that volitional impairment, by definition, is specific to an individual. (Transcript at 654.)

The import of Dr. Siegel’s statement that “[y]ou need to individualize it” is particularly clear when it is recalled that the STATIC-99 does not in fact measure an individual’s risk of reoffending, but only expresses that an individual has some shared characteristics of a known group of reoffenders. Thus, the relationship between the test results and risk of reoffending is even more attenuated from the question as to whether a particular respondent has “serious difficulty in controlling behavior.”

Revisiting the list of the 10 risk factors set forth in the STATIC-99 also serves to illustrate that the total raw score on the STATIC-99 has no bearing on the issue of volitional control. *403For example, a respondent who was 26 years of age and knew the victim for 27 hours could score two points lower than a respondent who was 23 years of age and knew the victim for 20 hours. Clearly this numerical score, while relating to the risk of reoffending statistically, has very little to say concerning volitional impairment, even if the addition or deletion of these two points changed the individual’s risk from low to high, or vice versa. By way of further example, a diagnosed pedophile with a first arrest could likely receive a low STATIC-99 score, yet have a strong predisposition and lack volitional control, while an individual diagnosed with antisocial personality disorder receiving a STATIC-99 score on account of just one index sex offense could receive a high STATIC-99 score for numerous violent arrests or convictions and yet not have a mental abnormality. (Testimony of Dr. Harris, transcript at 49-51.)

Further, Dr. Scroppo’s approach violates the statutory scheme in New York. Supreme Court precedent requires (at a minimum) that a person have a condition, disease or defect that results in an inability to control behavior in order for civil confinement to be imposed. In other words, his or her volitional capacity must be impaired by the condition, disease or defect. As Dr. Scroppo acknowledged, there has to be a “linkage between the condition and the serious difficulty in controlling conduct.” (Transcript at 504.) Under the New York statute that “linkage” is required not only for confinement, but even if respondent is only placed on SIST as the volitional impairment is part of the definition of mental abnormality, regardless of whether respondent may or may not end up confined. Using the STATIC-99 to screen out individuals with a low STATIC-99 score without a determination of volitional impairment — as required for a finding of mental abnormality — removes those respondents eligible for treatment in the community under the statute, i.e., those persons who have a mental abnormality and by definition are subject to SIST but who have a theoretically lower level of dangerousness.

Dr. Siegel also acknowledged on cross-examination that the actuarial risk assessment instruments do not measure volitional impairment. (Transcript at 683.) Nevertheless, he opined that the STATIC-99 is still relevant because he believes that “you’re going to find more people who have difficulty in controlling behavior in the higher risk population than in the lower risk population.” (Transcript at 591, 645, 683.) His testimony was no more than a statement that among the group of high-risk *404persons who reoffend, you are more likely to find persons who have serious difficulty in controlling behavior. Dr. Siegel admitted that he did not know if this statement was supported by any reference to any tests, studies, or literature. (Transcript at 684.) In addition, even if accepted as true that the group of persons who are likely to reoffend may include many people who have serious difficulty in controlling behavior, this observation is not persuasive to the legal issues at trial, since the “serious difficulty in controlling behavior” with which we are concerned must result from a condition, disease or disorder in the first instance in a particular respondent, a finding that the STATIC-99 cannot make. Finally, it is evident that a mere “association” under accepted scientific principles is not sufficient to establish a causal effect. (Fraser v 301-52 Townhouse Corp., 57 AD3d 416 [1st Dept 2008] [although there is general agreement in the scientific community that dampness and mold are associated with health problems, the observed association was not strong enough to constitute evidence of a causal relationship; plaintiffs’ experts’ conclusions rejected under Frye analysis].)

Dr. Siegel, in essence, expressed a philosophical disagreement with the policy and procedures underlying article 10. Dr. Siegel noted that under New Jersey’s civil commitment laws, confinement was the only possible treatment mechanism, and there would exist no reason to look at those with lower risk categories on a STATIC-99, since a likelihood of reoffending is a prerequisite for civil confinement in that state. (Transcript at 563.) He saw his role as identifying only the high risk offenders, for which the STATIC-99 was useful. From a policy perspective, he believed that you “don’t want to waste precious judicial resources providing treatment and strict supervision to lower risk individuals because its not going to appreciatively affect the rate of recidivism.” (Transcript at 614, 633.) He stated that, in his opinion, proof of a likelihood of recidivism should be required before the imposition of SIST, as even SIST imposes significant burdens on a respondent. (Transcript at 633.) His concern was “over inclusiveness,” and the STATIC-99, for him, gives context to the evaluation. Further, because it is difficult to determine volitional control — to determine whether “they choose to do it or they’re driven to do it” (transcript at 686), it makes “less sense” to him to try to make a determination about volitional control when the person is “less likely to engage in the proscribed behavior in the first place.” (Transcript at 684.)

*405Contrary to the expert’s valid policy preference, as already stated, the statute is clear that evidence of recidivism is not required to find a mental abnormality. An individual with a low risk of recidivism as indicated by ARAs and/or clinical judgment — or, indeed, with little (or theoretically no) likelihood of committing sex offenses in the future — may, consistent with the statute, be found to have a mental abnormality, and would thus be subjected to SIST. As stated in Mental Hygiene Legal Serv. v Spitzer (2007 WL 4115936, *13, 2007 US Dist LEXIS 85163, *43 [SD NY 2007], affd 2009 WL 579445, 2009 US App LEXIS 4942 [2d Cir 2009]):

“Article 10 itself makes clear, however, that not all persons who have a mental abnormality sufficient to meet the definition of a ‘sex offender requiring civil management’ require actual confinement. Under the statute, ‘sex offender requiring civil management’ is a catch-all category that includes individuals subject to the act who are not individuals ‘likely to be a danger to others and to commit sex offenses if not confined to a secure treatment facility.’ MHL §§ 10.03[e], [q], [r]. At least some individuals who fit this broader category will not be subject to detention, but only to parole- or probation-like conditions of supervision while at liberty in the community, even after a jury finding that they have been proven, by clear and convincing evidence, to be sex offenders in need of civil management.”

Respondent correctly argues that some degree of dangerousness is inherent in a finding of mental abnormality, such that an individual who is predisposed to commit sex offenses and has volitional impairment could be considered presumptively dangerous. (Respondent’s mem, dated May 15, 2009, at 11; People v Brooks, 19 Misc 3d 407 [2008] [Sup Ct, Kings County, Mullen, J.].) However, additional proof (beyond what is inherent in the definition of mental abnormality) that a respondent is dangerous or likely to reoffend is simply not required under New York’s statutory scheme for a finding of mental abnormality during the trial phase.

While a legislative scheme similar to New Jersey’s, under which only those persons likely to reoffend were civilly committed, might present an easier role for the expert, this is not the state of the law in New York, since it would circumvent an inquiry into the psychological structures of an entire class of respon*406dents who are deemed not to be sufficiently dangerous to require confinement but are still in need of treatment in the community.

Finally, numerous law journal articles examined by the court demonstrate that the relevant scientific community accepts actuarial testing only as predictive of risk of reoffending, and that such testing cannot and should not be used to determine volitional capacity. For example, as stated in one scientific journal, “no variables on either the actuarial methods or the structured clinical methods allow the clinician to draw conclusions regarding the volitionality of the offender’s behavior.” (Jackson, Rogers and Shuman, The Adequacy of Sexually Violent Predator Evaluations: Contextualized Risk Assessment in Clinical Practice, 3 Int’l J Forensic Mental Health [No. 2] 115, 126.) Similarly, “the area of volitional control is the element of sexual predator evaluations that would appear to have the least empirical support or scientific evidence. We have no data that identifies something that causes lack of control or degrees of control.” (Miller, Amenta and Conroy, Sexually Violent Predators: Empirical Evidence, Strategies for Professionals, and Research Directions, 29 L & Hum Behav [No. 1], at 49.) Further, “there exist no specialized tests or procedures that mental health professionals can use for this purpose [to assess volitional impairment]. They must ‘fly solo,’ relying on their own judgment or discretion.” (Hart and Kropp, Sexual Deviance and the Law, printed in Sexual Deviance, Theory, Assessment, and Treatment, ch 29; petitioner’s exhibit 16, tab 9, at 11.)

A key illustration of the fact that the relevant scientific community (psychiatrists and psychologists dealing with civil commitment of sex offenders) does not employ actuarial testing in the manner argued by respondent is contained in Jackson and Tess, Evaluation for Civil Commitment of Sex Offenders: A Survey of Experts (19 Sexual Abuse: J Res & Treatment [No. 4] 425 [2007]), which surveyed 41 experts who conduct sex offender civil commitment evaluations. With respect to the “volitional element” (i.e., serious difficulty in controlling behavior), the authors noted, “What is clear from the statutes, and the findings of Hendricks (1997) and Crane (2002), is that the presence of a mental disorder and determination of ‘high risk’ are not enough. The link between the two is the crux of the civil commitment.” (Id. at 428-429.) Significantly, in surveying the methods used by experts to determine the presence of volitional impairment, not one expert indicated that actuarial testing was employed in forming a conclusion. The experts instead reached *407conclusions based on the “existence of a personality disorder combined with previous sex offending,” “existence or nonexistence of a paraphilia,” “self report that volitional impairment is present,” or other criteria not including actuarial testing. (Id. at 435-436.) Actuarial testing was employed only in determining future risk of reoffending. (Id. at 434-435.) This evidence of scientist “head counting” is cogent evidence that risk status does not inform a conclusion as to volition.

Indeed, the relevant scientific community recognizes the real possibility of misleading the factfinder even when the ARAs are used exclusively for the only purpose for which they were intended, namely, quantifying risk:

“Mental health professionals need to be able to acknowledge not only what they are capable of doing, but also their own limitations. Some things may not yet be possible. Actuarials can potentially be very misleading if one incorrectly attributes the overall risk of a previously screened group to a specific individual within it ... If that fact is not made sufficiently clear at a civil commitment hearing, then the prejudicial effects of actuarial data (i.e. its capacity to be misleading) may outweigh its probative value (i.e., its capacity to assist in determining truth). For that reason, the use of actuarial data should be restricted to the screening process, rather than introducing it as evidence of a given individual’s likely risk.” (Berlin, Galbreath, Geary and McGlone, The Use of Actuarials at Civil Commitment Hearings to Predict the Likelihood of Future Sexual Violence, 15 Sexual Abuse: J Res & Treatment [No. 4] 377, reprinted by Department of Psychiatry and Behavioral Sciences at Johns Hopkins University School of Medicine, at 7 [2002].)

Finally, the fact that the STATIC-99 is employed by the Office of Mental Health in the screening process to determine which convicted sex offenders should be subject to article 10, and further, that it is often relied upon by experts later called to testify at trial, is not significant, as the STATIC-99 is relevant in those circumstances to issues other than the existence of a mental abnormality, such as dangerousness and likelihood of reoffending. Similarly, the fact that it is routinely offered into evidence by the Attorney General during probable cause hearings also has no bearing on the ultimate issues in this case. At a probable cause hearing, there must be an additional determination of *408dangerousness to hold the respondent pending trial. The STATIC-99 is relevant and accepted for that determination. (See Mental Hygiene Legal Serv. v Spitzer, 2007 WL 4115936, 2007 US Dist LEXIS 85163 [2007], affd 2009 WL 579445, 2009 US App LEXIS 4942 [2009].) Lastly, while the Legislature encouraged the use of ARAs in its legislative findings (see Mental Hygiene Law § 10.05 [d], [e]), it did not provide that they are admissible at trial.

New STATIC-99 Norms: Problems with Application

Further complicating any use of the STATIC-99 during trial, or even at the dispositional phase, is the fact that new research by Dr. Hanson has resulted, in essence, in a total revamping of the STATIC-99 and the manner in which it may be used. While this issue was not the original focus of the arguments of the parties, the new norms were discussed by various experts at the hearing, and the apparent confusion and consternation as to the use of the new norms voiced at the hearing illustrates the current state of flux as to the application of the STATIC-99 for predicting recidivism in civil management proceedings.

Dr. Katsavdakis testified that the original STATIC-99 norms, which were employed until approximately the end of 2008, had been developed from samples of sex offenders who were initially released from incarceration in the 1960s, 1970s and 1980s. However, since the mid-1990s the rate of violent crimes has decreased, making the actual rate of recidivism much lower, and thus new norms were required which more accurately reflected these trends. (Transcript at 271.) As reported by the creator of the STATIC-99, new data on crime rates led to a reevaluation of the continued efficacy of the STATIC-99. Dr. Hanson states that

“[s]exual and violent recidivism rates per Static-99 score are significantly lower in our data than they were in the samples used to develop the original Static-99 norms (reported in Harris, Phenix, Hanson, & Thornton, 2003). Even though we have yet to finish our analyses, the evidence is sufficiently strong that we believe the new norms should replace the original norms. Compared to the original norms, the new norms are based on more offenders, more complete data, and more recent, representative samples.” (Hanson, Helmus and Thornton, Reporting Static-99 in Light of New Research on Recidivism Norms, *409available at http://www.static99.org/pdfdocs/forum_article_feb2009.pdf.)

The significant difference between the “old” norms and the “new” norms is that while previously any given score related to only one rate of recidivism for any given five-year period, the new norms now indicate a range of recidivism for each five-year period. There are now two separate rates of recidivism for each risk group: one for “routine” offenders, called the Correctional Services of Canada (CSC) offenders, and one for “pre-selected high risk” offenders. This high risk group is defined as “offenders who had been judged by some administrative or decision-making body or tribunal to be of sufficiently high risk to warrant exceptional measures (e.g., treatment order, preventive or indefinite detention, denial of statutory release).” (Id.) The findings based on this new data led to a fundamental reformulation of the manner in which the results of the STATIC-99 were required to be evaluated. Again, as reported by Hanson et al.:

“Differences in recidivism within each Static-99 score on the basis of sample type and offender type suggest that evaluators can no longer, in an unqualified way, associate a single Static-99 score with a single recidivism estimate. Instead, each Static-99 score is associated with a range of recidivism estimates, and evaluators must make a separate judgment as to where a particular offender lies within that range. This new conceptualization of recidivism norms forces evaluators to consider factors external to the risk scale. . . [T]he best method of considering these external factors is as yet unknown . . .

“Currently, our recommendation is to report recidivism estimates with the new norms in two stages. The first stage involves reporting an empirically-derived range of recidivism risk . . . The second stage involves making a professional judgment as to where a particular offender is likely to fall within that range. This judgment represents a separate task from reporting the empirical recidivism rates; currently, there is no research to assess how well evaluators are able to make this judgment. Until further research is conducted, however, this professional judgment is unavoidable.” (Id. [emphasis added].)

The new norms consequently require the person performing the test to report a range of risk, and then, using clinical *410evaluation in a manner not described by Dr. Hanson and his associates, make a professional judgment as to where a particular offender is likely to fall within that range. At the outset, it is noted that this new approach interjects a clinical component into the actuarial test. As Dr. Harris testified, historically, actuarial testing was undertaken in the first instance in an effort to avoid the judgments of individual practitioners as to an individual’s risk of reoffending, since individual judgments were perceived to be highly inaccurate. (Transcript at 24.) Actuarial testing, based on known criteria and separated from clinical judgments, was intended to provide an objective and scientifically-based alternative to clinical judgments. However, with the emergence of the new norms, Dr. Harris does not believe that “the community that uses it yet has kind of figured out what to do with this at this point because . . . this is such a significant change in terms of the STATIC.” (Transcript at 79.) Clinicians are still waiting to hear more information about “why he [Dr. Hanson] now created a two-tiered system” and how to interpret the results. Dr. Harris stated, “it really is troubling that you come up with one score and you get two very different outcomes as to what the risk is to sexually reoffend. So this is brand new.” (Transcript at 79.)

Dr. Hanson provides no guidelines or methodology in his explanatory materials as to how to make the clinical assessment of where, on the spectrum of risk, an individual lies. Indeed, it appears that the underlying data is still being analyzed by Dr. Hanson, and that further changes may take place. (Testimony of Dr. Katsavdakis, transcript at 278.) As such, each expert postulated a different method. Dr. Harris stated that there is no consensus, and that “all of the changes” call into question the STATIC-99’s “reliability and validity.” (Transcript at 101-104.) Dr. Hamill testified that, generally speaking, he believes that article 10 respondents are, “by definition,” in the “highest risk group” for percentage of recidivism, since they are the type of offenders who have served terms in state prison and have already been screened from the larger population of sex offenders, or have already been identified by treatment in mental health facilities. (Transcript at 361.) He opined that few experts in the community actually knew about the new norms, or had employed them. (Transcript at 362.) He suggested that clinicians are still waiting to hear what the “research tells us about how to divide [respondents] into two groups and what the rules are” (transcript at 358), and he further acknowledged that there *411was no consensus in the community about how that should be done. (Transcript at 361-362.)

Dr. Katsavdakis stated that the new STATIC-99 was still scientifically acceptable as an ARA, because the underlying static risk factors had not changed, but admitted that its “predictive accuracy had decreased.” (Transcript at 287.) Dr. Katsavdakis also testified that he would need to obtain more data from Dr. Hanson and his associates on the demographics of the two samples to determine whether an individual shares more characteristics of one group over the other.

Dr. Siegel said that he did not know how he would use the new norms, and had not seen the underlying data in order to reach a conclusion as to what constitutes a “high risk” individual.

Dr. Scroppo informed the court that there were actually now four sets of norms, the CSC, the high risk, a group that did not fit in either category, and then the combined aggregate of all three groups. He has been using the norms for the combined group which Dr. Hanson initially released toward the end of 2008. (Transcript at 537.) He believes this a valid methodology especially where a distinction cannot be made between the CSC and high risk groups due to a lack of sufficient information, for example, concerning a respondent’s progress in prison by way of prison tickets, programs or treatment. (Transcript at 413.) Presumably, then, Dr. Scroppo is indicating that clinicians will be required to interject the dynamic factors discussed above in employing the new norms.

Illustrating the difficulty in employing the STATIC-99 under the new norms is the fact that in this particular case, the significance of Mr. Rosado’s STATIC-99 score changed three times in six months. The respondent’s score of 4 translates into the following:

[[Image here]]

*412It should be noted that the “old” STATIC-99 measured the rate of reconviction and was understood as underreporting the actual risk of reoffending. The “new” STATIC-99 is now a blend of data based on two outcome variables, arrest rates and reconviction rates. (Testimony of Dr. Scroppo, transcript at 477; testimony of Dr. Siegel, transcript at 720, 728.) In this regard, Dr. Harris testified that “we don’t know how [Dr. Hanson] does it” and so, in essence, “it’s really impossible now to really understand” what this means (transcript at 101-102). Dr. Scroppo testified that he was not aware of any other ARAs that mix both sets of data. (Transcript at 540.)

It seems apparent, in view of the lack of direction or instruction, and the resulting disagreement as to how the new norms should be employed, that there can be no general acceptance in the relevant scientific community of the validity of the new norms. There was no testimony that the new norms were “peer reviewed” or that a hybrid actuarial-clinical assessment instrument is accepted as valid in the relevant scientific community. Indeed, there was testimony that there must be analysis and testing by other scientists to validate the new norms. (Testimony of Dr. Scroppo, transcript at 482.)

This court has also considered the decision in State of New Hampshire v Ploof (No. 07-E-0238, Super Ct, Hillsborough, SS, Northern Dist, available at http://www.static99.org/pdfdocs/daubert-order4-28-09.pdf), in which the court examined the “new” STATIC-99 norms under a New Hampshire statute which, in essence, codified the factors set forth in Daubert v Merrell Dow Pharmaceuticals, Inc. (509 US 579, 589-590 [1993]). The New Hampshire court concluded, following a hearing, that the “old” norms are no longer accepted as reliable, as the authors of the STATIC-99 reported that the “old” norms should not be used. In addition, the New Hampshire court held that the “new” STATIC-99 norms could be employed to the extent that an expert would be permitted to testify as to the range presented between the CSC group and the high-risk group. For example, the court would permit an expert to testify that the respondent Rosado’s risk is from 7.7% to 19.1% after five years postrelease. The New Hampshire court would not, however, permit any testimony as to where an individual respondent fell on the range presented by the new norms, as any clinical judgment in that regard would not be based on reliable principles and methods.

While this court similarly concludes, albeit under a Frye analysis, that the use of a clinical component to place a respondent *413within a wide range of risk has not been shown to be scientifically accepted, the expert evidence placed before this court did not support the conclusion, reached in the Ploof case, that it is acceptable to report just the range under the “new” STATIC-99 norms without the exercise of clinical judgment. First, Dr. Hanson reports that using clinical judgment to particularize the risk is “unavoidable” and indicates that he expects to provide some future guidance. Second, none of the five experts who testified at the hearing endorsed the method of reporting the range of risk without the clinical assessment. As Dr. Scroppo testified, “in terms of the procedure, I think the procedure is defined by the test maker and you can’t come up with your own procedure.” (Transcript at 482.) Third, it has not been shown that reporting a range and thereby avoiding a clinical judgment is a methodology which is generally accepted in the relevant scientific community. Fourth, there was no testimony that the new percentages are statistically valid, i.e., that the accuracy of the numbers has been accepted by the scientific community following peer review of the raw data. Fifth, it was clear that some experts are using yet another option, namely, the combined aggregate norms that were released in late 2008, and although it is quite possible that this approach may be the one which ultimately prevails in the scientific community, there was no testimony indicating that the combined norms were more or less reliable than the bifurcated norms. Sixth, there was no testimony concerning the acceptance by the scientific community of the mixing of two outcome variables, both reconviction and reoffense.

The court is quite mindful that the new norms were not the primary focus of this hearing, and did not even come to light until after the hearing had commenced. Quite possibly, at a hearing in which the new norms were placed squarely in issue, the parties may be able to adduce more cogent evidence as to the reaction of the scientific community to these new developments as well as address some of the six issues highlighted by the court which were left unresolved in this hearing. Additionally, as the import of the new norms becomes more widely understood in the scientific community, it is possible — perhaps likely — that a consensus will emerge. Therefore, the court must point out that the determination it makes as to the new norms is not intended to preclude other approaches to the subject. Rather, the court’s conclusions are limited by the evidence before it and the fact that a response by the relevant community is quite likely still being formulated.

*414As appears from the hearing testimony and the report issued by Dr. Hanson himself, the manner in which the new norms should be employed, and in particular, the method by which the evaluator is to determine where on the range an individual is to be placed, is not understood by experts. In view of this lack of understanding by experts, there can be no general acceptance of the application of the new norms. Moreover, the hearing testimony did not establish that any literature exists showing a general acceptance of the apparently new methodology of employing dynamic or other factors to alter ARA results. Indeed, as the Attorney General’s Office conceded, the STATIC-99 is “coming under question as to its validity” for its intended purpose in light of these new norms. (Petitioner’s affirmation, dated Apr. 24, 2009, at 29.) In view of the novel, untested, and uncertain aspects of the present norms, based on this record they cannot pass muster under Frye analysis (although, as noted above, further evidence may be adduced in this or other article 10 proceedings as to scientific acceptance of the new norms). It is noted that no one questions the underlying methodology of the STATIC-99 which generates a raw score, or the system of assigning a relative risk (low, moderate, medium-high, or high) based on the raw score. Thus, testimony as to the raw score and the relative risk, low, medium or high, is presumptively admissible, not at trial, but at the dispositional phase.

The Prejudicial Effect of Admission of STATIC-99 Results

Lastly, under a Frye analysis, even if “technically relevant,” the evidence can be precluded “if its probative value is substantially outweighed by the danger that it will unfairly prejudice the other side or mislead the jury.” (Scarola, 71 NY2d at 777.) In the immediate case, respondent’s STATIC-99 score of “4” is considered to be a medium-high risk of reoffending. Under the “new” norms, assuming future admission under Frye, that risk percentage varies from 8.2% to 27.3% over 10 years. Interestingly, respondent’s score of “4” — clearly not low, however not squarely high — provides the perfect example of the pitfalls of admitting the STATIC-99 at a jury trial. The testimony could arguably benefit respondent if, for example, the jury believed that the risk was in fact closer to 8% but could be greatly prejudicial to respondent if the jury accepted testimony that the risk was in fact closer to 27%. However, as discussed above the score of “4” has at best marginal relevance to the issue of whether respondent has a mental abnormality in the first instance, since the score only means that respondent shares certain characteristics of a group found to reoffend at a certain *415rate without telling the jury anything about respondent’s volitional capacity. Given the tenuous connection between a STATIC-99 score and volitional capacity, a jury could easily be confused by the evidence and give it undue significance in either direction. In this case, a jury, believing this score to be low, might wrongfully conclude that respondent has no mental abnormality when in fact one might exist, even if ultimately confinement might not be required due to his “lower” risk of reoffending.

On the other hand, since a score of “4” can arguably be considered a medium-high risk of reoffending, there is an equal chance that the jury could conclude that respondent has a mental abnormality based on that evidence alone. Indeed any high STATIC-99 score — indicating that a respondent is dangerous and likely to reoffend — could compel a jury to conclude unfairly that a respondent has a mental abnormality when one might not exist. Dr. Siegel acknowledged that even clinicians may give undue importance to a high score. (Transcript at 619.)

The New Jersey Superior Court, Appellate Division gave true meaning to this potential for jury confusion in In re Commitment of R.S. (339 NJ Super 507, 539, 773 A2d 72, 91 [2001], affd 173 NJ 134, 801 A2d 219 [2002]), when the court observed, after finding that actuarial risk assessment instruments pass muster under Frye with respect to the issue of future dangerousness, that

“SVPA commitment hearings are tried before a judge, not a jury. The court understands that it is the ultimate decision maker and must reach a conclusion based upon all of the relevant evidence ‘psychiatric or otherwise — according each type such weight as [it] see[s] fit.’ State v. Fields, 77 N.J. 282, 308, 390 A.2d 574 (1978). An experienced judge who is well-informed as to the character of the actuarial instruments and who is accustomed to dealing with them is much less likely to be prejudiced by their admission than a one-case, fact-finding jury would be.”

In New York, unlike in New Jersey, the issue of mental abnormality is tried before a jury, unless a jury is waived. Even if this testimony were in some way relevant on the issue of mental abnormality, the difficulty of presenting the subtle concepts required to understand the test and its operation, the marginal probative value of the test for predicting recidivism, and the risk that the jury will be confused or misled as to the significance of the STATIC-99 results, warrant exclusion from the trial evidence.

*416Collateral Estoppel

Respondent argues that a decision by another justice of this court, in a wholly unrelated article 10 proceeding involving a different respondent, denying petitioner’s motion to exclude the STATIC-99 collaterally estops petitioner from contesting the introduction of the STATIC-99 results at trial in this proceeding. Respondent does not argue that collateral estoppel bars the present Frye analysis, since none was undertaken in connection with the other proceeding. This court is not bound by the decision of a justice of coordinate jurisdiction (Mountain View Coach Lines v Storms, 102 AD2d 663 [2d Dept 1984]), and it does not appear that collateral estoppel can be applied to preclude the assertion of a legal argument in one proceeding which was rejected in another proceeding. Assuming, without deciding, that a determination on a motion in limine in an unrelated proceeding constitutes a final determination for the purpose of collateral estoppel, the issues raised on the present motion (other than those related to the Frye analysis) concern the statutory interpretation of article 10 and the relevance of evidence generally under the statute. These are primarily issues of law, to which the doctrine of collateral estoppel is not applicable. (See Matter of Oneida Indian Nation of N.Y. v Pifer, 43 AD3d 579 [3d Dept 2007]; Sterling Natl. Bank v Eastern Shipping Worldwide, Inc., 35 AD3d 222 [1st Dept 2006].)

Conclusion

The STATIC-99 is not scientifically accepted, under Frye, as a tool to inform decision-makers as to the existence of a mental abnormality as defined under Mental Hygiene Law article 10, as it does not measure any of the elements of that definition. In addition, under New York State’s unique bifurcated civil management proceedings, testimony of recidivism is not relevant at the trial phase. Lastly, in view of the recent development of the new norms, and an entirely new and undeveloped methodology for applying those norms, it cannot be said that the new norms of the STATIC-99 (despite its past acceptance) are now sufficiently understood and accepted in the relevant scientific community under Frye, even for the purpose for which it was created — namely, to quantify risk. The motion in limine is granted, and the testimony concerning respondent’s STATIC-99 score is excluded from the trial testimony.
