Smith v. Schriro

Opinion by Judge REINHARDT; Concurrence by Judge SCHROEDER; Special Concurrence by Judge REINHARDT; Dissent by Judge CALLAHAN.

OPINION

REINHARDT, Circuit Judge:1

This case, to which the Antiterrorism and Effective Death Penalty Act of 1996 (AEDPA) does not apply, returns to us following remand to the Arizona state *1178 court to conduct an Atkins evidentiary hearing. After that hearing, the state trial court denied Smith’s Atkins claim, the Arizona Court of Appeals denied special action relief, and the Arizona Supreme Court denied Smith’s petition for review. The district court then found Smith’s Atkins claim without merit and denied his petition for a writ of habeas corpus. We now hold that Smith is intellectually disabled under Atkins, and we reverse.2

I. FACTUAL AND PROCEDURAL BACKGROUND3

In 1982, Robert Smith was convicted in Arizona state court of kidnapping, sexual assault, and murder and sentenced to death. Lambright v. Stewart, 167 F.3d 477, 479 (9th Cir.1999), reh’g granted, vacated, 177 F.3d 901 (9th Cir.1999), rev’d, en banc, 191 F.3d 1181 (9th Cir.1999). On June 20, 2002, the Supreme Court decided Atkins v. Virginia, 536 U.S. 304, 122 S.Ct. 2242, 153 L.Ed.2d 335 (2002), holding that the execution of intellectually disabled criminals constitutes “cruel and unusual punishment” prohibited by the Eighth Amendment.4 Under Atkins, if Smith was intellectually disabled at the time he committed the crime or at the time of his trial, he may not be executed. We suspended federal habeas proceedings, ordered supplemental briefing and remanded to the state court to determine whether Smith was intellectually disabled and thus ineligible for execution under Atkins.

The Pima County Superior Court reopened discovery and held a two-day evidentiary hearing on October 29 and November 1, 2007. The court heard testimony by Dr. Thomas Thompson, a neuropsychologist and prescribing psychologist selected by Smith, who opined that there is a very high probability that Smith was intellectually disabled at the time the crime was committed in 1980. The court also heard testimony from Dr. Sergio Martinez, a psychologist selected by the State, who stated that there is a high degree of probability that Smith was not intellectually disabled in 1980. The parties entered numerous exhibits into evidence, including the deposition transcripts of twelve lay witnesses who described their observations of Smith as a child or young adult.

Following the hearing, the Pima County Superior Court found on March 27, 2008, that Atkins did not preclude Smith’s execution. The Arizona Court of Appeals denied special action relief later that year, Smith v. Kearney, No. 2 CA-SA 2008-0019, 2008 WL 2721155 (Ariz.Ct.App. July 11, 2008), and the Arizona Supreme Court denied Smith’s petition for review. In September 2010, we remanded this case to the district court for the limited purpose of considering Smith’s Atkins claim. The district court denied the claim in December 2012. Smith timely appealed.

II. ANALYSIS

A. Jurisdiction and Standard of Review

We have jurisdiction under 28 U.S.C. §§ 1291 and 2253. Sivak v. Hardison, 658 F.3d 898, 905 (9th Cir.2011). We *1179 review de novo the federal district court decision denying Smith’s 28 U.S.C. § 2254 habeas petition. Alvarado v. Hill, 252 F.3d 1066, 1068 (9th Cir.2001).

Because Smith filed his federal habeas petition prior to AEDPA’s April 24, 1996 effective date, pre-AEDPA standards govern our review even though Smith filed amended petitions subsequent to AEDPA’s effective date. See Sivak, 658 F.3d at 905 (applying the pre-AEDPA standard of review where initial petition was filed prior to AEDPA’s effective date and amended petitions were filed following AEDPA’s enactment); Robinson v. Schriro, 595 F.3d 1086, 1099 (9th Cir.2010) (same); see also Lindh v. Murphy, 521 U.S. 320, 326, 117 S.Ct. 2059, 138 L.Ed.2d 481 (1997) (holding that Congress intended AEDPA to apply “only to such cases as were filed after [AEDPA’s] enactment”).

Under pre-AEDPA law, state court factual findings are entitled to a presumption of correctness, subject to eight exceptions enumerated in the previous version of 28 U.S.C. § 2254(d). Sivak, 658 F.3d at 905-06. Among the exceptions to the rule regarding a presumption of correctness is the following: the state court’s “factual determination is not fairly supported by the record.” 28 U.S.C. § 2254(d)(8). Because the parties agree that whether Smith is intellectually disabled is a question of fact, we assume for purposes of this opinion that such is the case.5 The presumption of correctness also does not apply if the factual determination is based on the application of constitutionally impermissible legal principles. Lafferty v. Cook, 949 F.2d 1546, 1551 n. 4 (10th Cir.1991).

B. Legal Standard Governing Determination of Intellectual Disability Under Arizona Law

In 2001, one year before Atkins was decided, the Arizona legislature enacted a statute prohibiting the execution of intellectually disabled persons and creating a process by which capital defendants are evaluated for intellectual disability. Ariz. Rev.Stat. Ann. § 13-703.02 (2001), 2001 Ariz. Sess. Laws, Ch. 260, § 2; State v. Grell (Grell I), 205 Ariz. 57, 66 P.3d 1234, 1240 (2003). Under the version of the statute in effect at the time of Smith’s Atkins hearing in 2007, the procedures for evaluating a defendant were automatically triggered upon the State’s filing a notice of intent to seek the death penalty. Ariz. Rev.Stat. Ann. § 13-703.02(B) (2006), as amended by 2006 Ariz. Sess. Laws, Ch. 55, § 1.6 The statute provides that the burden of proving intellectual disability lies with the capital defendant, who must prove his disability by “clear and convincing *1180 evidence.” Ariz.Rev.Stat. Ann. § 13-703.02(G).

The Arizona statute defines “mental retardation” as containing three elements: (1) “significantly subaverage general intellectual functioning” and (2) concurrent “significant impairment in adaptive behavior,” (3) “where the onset of the foregoing conditions occurred before the defendant reached the age of eighteen.” Ariz.Rev.Stat. Ann. § 13-703.02(K)(3). “Significantly subaverage general intellectual functioning” is defined as “a full scale intelligence quotient [IQ] of seventy or lower.” Ariz.Rev.Stat. Ann. § 13-703.02(K)(5). “Adaptive behavior” is defined as “the effectiveness or degree to which the defendant meets the standards of personal independence and social responsibility expected of the defendant’s age and cultural group.” Ariz.Rev.Stat. Ann. § 13-703.02(K)(1).

Under Arizona’s procedures for determining intellectual disability, the court appoints a prescreening psychological expert to determine the defendant’s IQ “using current community, nationally and culturally accepted intelligence testing procedures.” Ariz.Rev.Stat. Ann. § 13-703.02(B). If the expert determines that the defendant’s IQ is above 75, “the notice of intent to seek the death penalty shall not be dismissed on the ground that the defendant has mental retardation.” Ariz. Rev.Stat. Ann. § 13-703.02(C). If the IQ score is 75 or less, however, the court will appoint additional experts in consultation with the parties to prepare reports regarding whether the defendant is intellectually disabled. Ariz.Rev.Stat. Ann. § 13-703.02(D), (E). If at this point all IQ test scores are above 70, the defendant remains eligible for the death penalty. Ariz.Rev. Stat. Ann. § 13-703.02(F).

If the testing demonstrates that the defendant’s IQ score is equal to or less than 70, however, the court holds a hearing at which “the defendant has the burden of proving mental retardation by clear and convincing evidence.” Ariz.Rev.Stat. Ann. § 13-703.02(G). Under Arizona law, “[c]lear and convincing evidence is that which may persuade that the truth of the contention is ‘highly probable.’ ” In re Neville, 147 Ariz. 106, 708 P.2d 1297, 1302 (1985) (en banc). A determination by the court that the defendant’s IQ is 65 or below “establishes a rebuttable presumption that the defendant has mental retardation.” Ariz.Rev.Stat. Ann. § 13-703.02(G). However, “ ‘[t]he presumption of mental retardation based on the IQ scores vanishes ... if the State presents evidence that calls into question the validity of the IQ scores or tends to establish that [the] defendant does not otherwise meet the statutory definition of mental retardation.’ ” State v. Boyston, 231 Ariz. 539, 298 P.3d 887, 895 (2013) (quoting State v. Arellano, 213 Ariz. 474, 143 P.3d 1015, 1019 (2006)); see Arellano, 143 P.3d at 1018 (“A rebuttable presumption, however, ‘vanishes when the state provides contradictory evidence.’ ” (citation omitted)). “ ‘At that point, the IQ scores serve as evidence of mental retardation, to be considered by the trial court with all other evidence presented.’ ” Boyston, 298 P.3d at 895 (quoting Arellano, 143 P.3d at 1019).
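The numeric thresholds summarized in the two preceding paragraphs can be laid out as a short decision sketch. It is offered for illustration only and is not a statement of Arizona law: the function names and return labels are hypothetical, and the sketch reduces the statutory procedure to its IQ cutoffs (75, 70, and 65) while leaving out the adaptive-behavior and age-of-onset elements.

```python
# Illustrative sketch only. It models the IQ cutoffs of Ariz.Rev.Stat. Ann.
# § 13-703.02 as summarized in the opinion; every name here is hypothetical.

def screening_stage(prescreen_iq, expert_iqs):
    """Return the procedural stage reached under the summarized thresholds."""
    # Subsection (C): a prescreening IQ above 75 ends the inquiry.
    if prescreen_iq > 75:
        return "notice not dismissed"
    if not expert_iqs:
        raise ValueError("additional expert scores are needed at 75 or below")
    # Subsection (F): if every later score is above 70, eligibility remains.
    if all(score > 70 for score in expert_iqs):
        return "remains death-eligible"
    # Subsection (G): a score of 70 or below leads to a hearing at which the
    # defendant bears the clear-and-convincing burden.
    return "hearing"


def presumption_applies(court_found_iq, state_offers_contrary_evidence):
    """Rebuttable presumption at 65 or below, which vanishes once the State
    presents contrary evidence (Boyston, Arellano)."""
    if court_found_iq > 65:
        return False
    return not state_offers_contrary_evidence


print(screening_stage(74, [72, 68]))
print(presumption_applies(62, True))
```

When the presumption vanishes, the scores are not discarded; as the quoted cases explain, they remain evidence to be weighed with everything else, which the sketch does not attempt to model.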

Smith did not have the benefit of this procedural framework at the time of his trial because the trial took place nearly twenty years before the procedural framework’s adoption. The Arizona Supreme Court has held that in cases presenting Atkins claims in such a post-trial posture, courts should use Atkins as a guide and apply the pre-trial procedures of § 13-703.02 to the extent practical. As the Arizona Supreme Court explained in a capital case predating the passage of § 13-703.02,

We recognize that the procedures set forth in section 13-703.02 are not applicable in Grell’s case, as section 13-703.02 did not take effect until after Grell’s *1181 sentencing. Moreover, the procedures contemplated by section 13-703.02 are pre-trial procedures, triggered when the State files its notice of intent to seek the death penalty. The trial court should use Atkins as a guide and should, insofar as is practical in the post-trial posture of this case, follow the procedures established in [Ariz.Rev.Stat.] section 13-703.02.

Grell I, 66 P.3d at 1241 (footnote omitted); accord Arellano, 143 P.3d at 1017 (“[Ariz. Rev.Stat. § 13-703.02] applies to all capital sentencing proceedings, including post-conviction proceedings brought to determine whether a defendant meets the statutory definition of mental retardation.”).

C. Presumption of Correctness

As an initial matter, we must determine whether a presumption of correctness applies to the state court’s factual determination that Smith was not intellectually disabled at the time of the offense and trial. We conclude that it does not. As discussed in section II.C.1, we hold that the state court’s factual determination is not entitled to deference because it is “not fairly supported by the record.” 28 U.S.C. § 2254(d)(8). Also, as explained in section II.C.2, Judge Reinhardt would hold that deference is not due for the additional and independent reason that the Pima County Superior Court rendered its finding that Smith was not intellectually disabled under a constitutionally impermissible legal standard.

1. The State Court’s Factual Determination Is Not Fairly Supported By the Record

Our case law provides some guidance for determining when the exception codified at § 2254(d)(8) applies. Where the record is ambiguous, a state court’s factual determination is “fairly supported by the record” within the meaning of § 2254(d)(8). Palmer v. Estelle, 985 F.2d 456, 459 (9th Cir.1993); see Wainwright v. Goode, 464 U.S. 78, 85, 104 S.Ct. 378, 78 L.Ed.2d 187 (1983). Where the great majority of the evidence strongly points against the state court’s finding, however, the finding is not fairly supported. We have held that a factual determination is not fairly supported by the record even if it is supported by some evidence and other evidence is equally consistent with both the state court’s conclusion and a contrary conclusion, so long as the record as a whole “strongly suggests” a different conclusion. See Carriger v. Stewart, 132 F.3d 463, 473-76 (9th Cir.1997).

This standard must also be read in the context of the Supreme Court’s recent decision in Hall v. Florida, — U.S. —, 134 S.Ct. 1986, 188 L.Ed.2d 1007 (2014). In Hall, the Court emphasized that, in death penalty cases where a defendant’s intellectual functioning is a close question, the defendant “must be able to present additional evidence of intellectual disability....” Id. at 2001. In fact, in these situations, the court must not “view a single factor as dispositive” given the complexity of intellectual disability assessments. Id. Therefore, a court reviewing the whole record as required by the standard at issue must consider all indications of a defendant’s intellectual disability and may not discard relevant evidence.

Here, we do not defer to the state court’s ultimate conclusion that Smith was not intellectually disabled because it lacks fair support in the record as a whole.7 *1182 Nor do we defer to the state court’s weighing of the evidence where its decisions to discount certain evidence similarly lack fair evidentiary support or result from legal error.8 The evidence in this case overwhelmingly supports our conclusion that Smith satisfied both substantive prongs of intellectual disability — significantly subaverage general intellectual functioning and significant impairment in adaptive behavior — both prior to age eighteen and at the time of the crime.

a. Application of Atkins

The state trial court correctly concentrated its analysis on whether Smith was intellectually disabled at the time of the offense and the ensuing trial. In Atkins, the Court identified two rationales supporting its holding. First, concentrating on the time of the offense, the Court recognized that intellectually disabled offenders are less culpable for their crimes. Atkins, 536 U.S. at 317, 122 S.Ct. 2242; see also Hall, 134 S.Ct. at 1992-93. Specifically, the Court noted that there is reason to doubt whether either justification it had previously recognized as a basis for the death penalty — retribution and deterrence — applies to intellectually disabled offenders. Atkins, 536 U.S. at 318-19, 122 S.Ct. 2242. These individuals, the Court explained, suffer from impairments leaving them with “diminished capacities to understand and process information, to communicate, to abstract from mistakes and learn from experience, to engage in logical reasoning, to control impulses, and to understand the reactions of others,” making them more likely to act on impulse rather than premeditation, and as followers rather than leaders. Id. at 318, 122 S.Ct. 2242. These limitations diminish the individual’s relative culpability for the crime, and, consequently, the retributive justification of the death penalty. See id. at 319, 122 S.Ct. 2242 (“[T]he severity of the appropriate punishment necessarily depends on the culpability of the offender.”); see also Hall, 134 S.Ct. at 1992 (“No legitimate penological purpose is served by executing a person with intellectual disability.”). They likewise limit the death penalty’s deterrent effect, because these impairments “also make it less likely that [intellectually disabled offenders] can process the information of the possibility of execution as a penalty and, as a result, control their conduct based upon that information.” Atkins, 536 U.S. at 320, 122 S.Ct. 2242.

The Court’s second rationale concentrates on a defendant’s trial in light of the heightened risk that “[m]entally retarded defendants in the aggregate face a special risk of wrongful execution” because they are less able to effectively participate in their own defense for the purpose of making “a persuasive showing of mitigation.”9 Id. at 320-21, 122 S.Ct. 2242; see *1183 also Hall, 134 S.Ct. at 1993. Because the rationales underlying the right announced in Atkins concentrate on the time the crime was committed and the ensuing trial, we hold that a defendant comes within the protection of Atkins if he can demonstrate that he was intellectually disabled during either of these periods.10 Consequently, a defendant’s present condition is relevant only to the extent that it is probative of his condition during the relevant periods.

The defendant must, of course, qualify under the third prong as well. The onset of the mental disability must have occurred before he reached the age of eighteen.

We turn now to why the record does not fairly support the state court’s determination that Smith was not intellectually disabled. In order to do so, we must examine the evidence under the two substantive elements of the Arizona statute, and determine whether the evidence as a whole strongly points to the conclusion that the two statutory conditions existed at the time of the crime or trial, and whether the onset of each condition occurred prior to age eighteen.

b. Significantly Subaverage General Intellectual Functioning

“ ‘Significantly subaverage general intellectual functioning’ is the touchstone for proving [intellectual disability] and means ‘a full scale intelligence quotient [IQ] of seventy or lower.’” State v. Grell (Grell III), 231 Ariz. 153, 291 P.3d 350, 352 (2013) (quoting Ariz.Rev.Stat. Ann. § 13-753(E)(5)).11 It must be manifested before age eighteen, Ariz.Rev.Stat. Ann. § 13-703.02(E)(3), and at the time of the crime or trial, see Atkins, 536 U.S. at 317-21, 122 S.Ct. 2242.

1. Intellectual Functioning Prior to Age Eighteen

Smith took the Otis Intelligence Scale Test in April 1964 and again in October of that year, when he was fifteen years old, receiving scores of 62 and 71, respectively. The score of 62 Smith received the first time he took the test is the more relevant of the two scores in light of Dr. Thompson’s unrebutted testimony that Smith’s second test score of 71 was inflated by the practice effect of having taken the same test just several months earlier. Dr. Thompson explained that under the practice effect, a person scores higher on a test when it is readministered within a short period of time because he has become familiar with the test. Arizona courts and the most current clinical guidelines recognize the practice effect. See State ex rel. Thomas v. Duncan, 222 Ariz. 448, 216 P.3d 1194, 1195 n. 4 (App.2009) (“The practice effect occurs when a person performs better on a test because he or she has taken it before.”); id. at 1198 (stating that “a defendant may argue that the practice effect impacted the results” of successive IQ tests); Am. Ass’n of Intellectual and Developmental Disabilities, Intellectual Disability 38 (11th ed.2010) [hereinafter AAIDD 11th ed.] (describing research showing the artificial increase in IQ scores *1184 when the same instrument is readministered within a short time interval, and stating that established clinical practice is to avoid administering the same intelligence test within the same year to the same individual because it will often lead to an overestimation of the examinee’s true intelligence); Am. Psychiatric Ass’n, Diagnostic and Statistical Manual of Mental Disorders 37 (5th ed.2013) [hereinafter DSM-V] (identifying the practice effect as a factor capable of affecting test scores).

Under the Atkins framework Arizona later adopted, Smith’s IQ score of 62 would entitle him to a presumption of intellectual disability. See Ariz.Rev.Stat. Ann. § 13-703.02(G). The State, however, “present[ed] evidence that calls into question the validity of the IQ scores or tends to establish that [the] defendant does not otherwise meet the statutory definition of mental retardation.” Boyston, 298 P.3d at 895 (quoting Arellano, 143 P.3d at 1019). Specifically, the State points to the results of IQ tests administered by Drs. Thompson and Martinez in 2005 and 2007, on which Smith received scores of 89, 91, and 93. These scores demonstrate that Smith is not presently intellectually disabled, and, in the absence of IQ scores documenting Smith’s IQ at the time the crime was committed, raise an inference that he may not have been disabled at that time.12 Accordingly, “[t]he presumption ... based on the IQ scores vanishes” and we weigh the evidence as if no presumption had existed. Id. (quoting Arellano, 143 P.3d at 1019).13

In any event, the Otis test scores remain highly probative of Smith’s condition prior to age eighteen. The State asserts that the tests are unreliable, and points to Dr. Thompson’s testimony on cross-examination that by 1964 the Otis tests had not been “normed” against the current population for forty years, and that he had not seen the raw data from Smith’s Otis tests or any information regarding the conditions under which those tests were administered. Although the lack of contemporary norming may call into some question the accuracy of the test results, Dr. Thompson gave uncontroverted testimony that, due to the Flynn Effect, this would only have caused Smith’s scores to be overstated. The basic premise of the Flynn effect is that because average IQ scores increase over time, a person who takes an IQ test that has not recently been normed against a representative sample of the population will receive an artificially inflated IQ score. See James R. Flynn, Tethering the Elephant: Capital Cases, IQ, and the Flynn Effect, 12 Psychol. Pub. Pol’y & L. 170, 173 (2006) [hereinafter Flynn Effect]. This is because IQ scores are based on a normal distribution curve, and thus an individual’s score is meaningful only in relation to the scores of the other people who took the same test. See J.C. Oleson, The Insanity of Genius: *1185 Criminal Culpability and Right-Tail Psychometrics, 16 Geo. Mason L.Rev. 587, 598 (2009). When correcting for the Flynn Effect, “[t]he standard practice is to deduct 0.3 IQ points per year (3 points per decade) to cover the period between the year the test was normed and the year in which the subject took the test.” Flynn Effect, supra, at 173. The AAIDD recognizes the existence of the Flynn Effect and recommends correcting for the age of norms in outdated tests. AAIDD 11th ed., supra, at 37; see also Am. Ass’n on Mental Retardation, Mental Retardation: Definition, Classification, and Systems of Supports 56 (10th ed.2002) [hereinafter AAMR 10th ed.]. The Fourth and Eleventh Circuits have also recognized the existence of the Flynn Effect. See Walker v. True, 399 F.3d 315, 322-23 (4th Cir.2005) (reversing district court due to its failure to consider “relevant evidence” of the Flynn effect); Holladay v. Allen, 555 F.3d 1346, 1358 (11th Cir.2009) (“[A]ll of the scores were on WAIS tests, which may have reflected elevated scores because of the Flynn effect.”).14 Without referring to the Flynn effect by name, we too have adjusted IQ scores based on out-of-date norms. Gregory E, 811 F.2d at 1312 n. 2. Here, we conclude that, in light of Dr. Thompson’s uncontroverted testimony regarding the impact of the Flynn Effect, Smith’s score of 62 on the outdated Otis test renders it highly probable that his IQ at the time of the test was lower than 62, well below the cutoff for demonstrating “significantly sub-average general intellectual functioning” under Arizona law.
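The correction Flynn describes is simple arithmetic, and a minimal sketch may make it concrete. The 0.3-points-per-year rate is the figure quoted above; the function name is hypothetical, and the norming year of 1924 is an assumption inferred only from the testimony that the Otis norms were forty years old in 1964. The opinion itself computes no adjusted score and holds only that the true figure was probably below 62.

```python
# Illustrative sketch of the Flynn correction quoted in the opinion:
# deduct 0.3 IQ points for each year between norming and administration.
# The function name and the 1924 norming year are assumptions for illustration.

FLYNN_POINTS_PER_YEAR = 0.3


def flynn_adjusted_iq(observed_iq, year_normed, year_tested):
    """Return the observed score less 0.3 points per year of norm obsolescence."""
    years_since_norming = year_tested - year_normed
    if years_since_norming < 0:
        raise ValueError("a test cannot be taken before it is normed")
    return observed_iq - FLYNN_POINTS_PER_YEAR * years_since_norming


# Smith's first Otis score (62), taken in 1964 on norms assumed to date to 1924.
print(round(flynn_adjusted_iq(62, 1924, 1964), 1))
```

On those assumptions the deduction is about twelve points (0.3 multiplied by forty years), so the adjustment can only move the score further below the statutory cutoff of 70, which is the direction of error Dr. Thompson described.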

The record does not fairly support the state court’s determination to afford the Otis test little weight and discount Dr. Thompson’s opinion to the extent that he relied on the test.15 Although Dr. Thompson and Dr. Martinez each noted that additional information regarding the administration of the test would enhance its validity, neither witness concluded that *1186 the test results were invalid in the absence of such information. More fundamentally, we decline to disregard the Otis tests on the basis that Smith is unable to proffer the same level of detailed evidence regarding their administration as is available for recent tests administered by court-appointed psychologists. Like most states, Arizona places the burden on a defendant raising an Atkins claim to demonstrate, inter alia, significantly subaverage general intellectual functioning [meaning an IQ of 70 or below] occurring before the age of eighteen. Ariz.Rev.Stat. Ann. § 13-703.02(K)(3), (5). It is highly unlikely, however, that the people administering an IQ test to a child would ever anticipate the use of that test in an Atkins proceeding, and at the time of Smith’s tests the constitutional right provided by Atkins did not even exist. Consequently, records of childhood IQ tests will rarely include the detailed information collected for IQ tests administered under court supervision to adjudicate a defendant’s Atkins claim. To discount what may be the only evidence of subaverage general intellectual functioning prior to age eighteen on this ground would effectively deny the protection afforded by Atkins to individuals who are substantially older than eighteen years old, or whose trials predate Atkins, because it would render their intellectual disability nearly impossible to prove. Given the evidentiary challenges so often arising from the retrospective nature of Atkins claims, the Eighth Amendment requires that courts apply a more relaxed standard when determining the reliability of evidence documenting childhood onset of intellectual disability. Here, there is no indication that Smith was malingering when he took the Otis tests. Accordingly, although they do not provide a presumption of intellectual disability under Arizona law, we find that Smith’s first Otis test score nonetheless “serve[s] as evidence of mental retardation, to be considered by [this] [C]ourt with all other evidence presented.” Boyston, 298 P.3d at 895 (internal quotation marks omitted).

We hold the first Otis test score reliable for the additional reason that it is consistent with Smith’s contemporaneous poor academic performance. Under Arizona law, evidence of poor academic performance is evidence of subaverage intellectual functioning. Williams v. Cahill ex rel. Cnty. of Pima, 232 Ariz. 221, 303 P.3d 532, 540 (App.2013) (“[W]hen no childhood IQ tests were performed, subaverage intellectual functioning before the age of eighteen properly may be inferred from other evidence of intellectual functioning, such as school performance.”). Here, the evidence overwhelmingly demonstrates that Smith performed exceedingly poorly in school, scoring in the 2nd to 5th percentiles on the Stanford Achievement Test at age fifteen, placing him five to seven years below his age level and three to five years below grade level. Smith’s school transcripts reveal that he received nearly all “Ds” and “F’s” in his academic subjects, and that his education did not progress beyond the eighth grade, after which he dropped out of school. The State does not contest the validity of these records. Melva Jane Box, Smith’s older sister, testified that Smith was held back in all his grades, was placed in special education class for slow learners, and was even transferred to a special school “because he couldn’t learn.” Charles Caperton, one of Smith’s childhood neighbors, similarly testified that Smith was placed in special education classes. Taken together, Smith’s Otis test scores and poor academic performance overwhelmingly demonstrate that Smith experienced significantly subaverage general intellectual functioning prior to the age of eighteen. The state court’s determination to the contrary does not find fair support in the record.

*1187 2. Intellectual Functioning at the Time of the Crime and Trial

The more fundamental question in this case is whether Smith continued to suffer from subaverage intellectual functioning at the time of the crime and trial. The only evidence to the contrary is the IQ test scores conducted by Dr. Thompson and Dr. Martinez decades after the trial. Thus, the question is the relative weight that can fairly be given to the pre-crime and post-crime test scores insofar as they provide evidence determinative of Smith’s intellectual functioning at the time of the crime and trial, and whether the record fairly supports the state court’s conclusion that Smith did not experience significantly subaverage general intellectual functioning at that crucial time.

We begin by noting that the subsequent administration of IQ tests by Drs. Thompson and Martinez was substantially more remote from the period of Smith’s crime than the administration of the Otis test scores: twenty-five and twenty-seven years after the crime, in the former case, compared to sixteen years in the latter. Accepting each set of test scores as valid measures of Smith’s IQ at the time the tests were administered, this discrepancy renders more probable Smith’s assertion that his IQ at the time of the crime approximated the IQ reflected in his first Otis test score rather than his more recent, higher scores.

The key issue, however, is the strength of Smith’s evidence demonstrating the probability that his significant gains in IQ score occurred after, rather than before, his incarceration. Dr. Thompson testified that improvements in IQ score similar to those attained by Smith are possible for individuals like Smith whose cognitive problems stem from environmental factors rather than physical injury and who are later given appropriate antidepressant medication and placed in a structured environment. Certainly, Smith adduced substantial evidence of a horribly abusive and impoverished upbringing supporting Dr. Thompson’s opinion: he was routinely brutalized by his stepfather, and was subjected to extreme verbal and emotional abuse by his mother, interspersed with neglect and periods of outright abandonment. According to Box, Smith’s stepfather would beat him with “whatever was closest.... a belt, a stick, a coat hangar,” and also molested him. Martha Gau, Smith’s younger half-sister, similarly testified that Smith’s stepfather would tell him “he was good for nothing and would never amount to anything,” and would kick him and whip him with both ends of a belt; she recalled finding Smith’s bedsheets covered in blood following one particularly serious beating when he was about twelve or thirteen years old. Caperton saw Smith beaten with a belt “pretty regularly,” and witnessed one beating involving use of a two-by-four. Smith’s mother frequently left the children alone at a time when Smith was still young enough to be using a high chair. On one occasion when she was actually present, Smith’s mother engaged in extra-marital foreplay in the front seat of her car while Smith sat in the backseat. On another, after the children failed to adequately clean the dishes, she sent them outside with bowls on their heads to pick weeds from the yard while other children from the neighborhood gathered around them and laughed. As a result of this upbringing, Dr. Thompson opined, Smith became intellectually disabled with frontal lobe abnormalities.16

*1188 As evidence that IQ scores can improve following the commission of the crime in situations similar to Smith’s, Dr. Thompson cited robust data demonstrating that the use of antidepressants (which Smith took while incarcerated) can significantly increase brain functioning over time, noted that other death row inmates have attained improved functioning while incarcerated, and provided an anecdote of a patient who achieved a twelve point gain in IQ score after receiving medication for just four months. As evidence that his level of functioning improved over the course of his incarceration, Smith adduced testimony from multiple witnesses describing dramatic improvements in the quality of the letters he sent from prison. Gau testified that letters Smith sent her at the beginning of his incarceration were virtually unintelligible, but that over the following years his writing had improved “100 percent,” explaining that “it was like a totally different person was writing it.” Martha Hight, Smith’s aunt by marriage, similarly described Smith’s early letters from prison as partially unintelligible, and noted improvements in the letters he sent in later years. Smith also received tutoring while in prison. Ronald Labrecque, who worked for the Department of Corrections from 1986 until 1997, supervised Smith’s work on maintenance jobs over an eight year period. Labrecque also tutored Smith, helping him with his reading and providing him reading materials such as working manuals, and described witnessing a “vast improvement” in Smith’s reading ability over this time. Smith received additional help from Ed Schad, a fellow inmate, who would get books for Smith from the prison library and have him read them. None of this evidence is refuted by the State.

For his part, although disagreeing with Dr. Thompson’s ultimate conclusion, Dr. Martinez agreed with several of his key premises. Dr. Martinez testified that significant IQ gains are possible, and acknowledged that Smith’s IQ gains were not without precedent. He also agreed with Dr. Thompson’s characterization of prison as a “structured environment.” More significant, Dr. Martinez testified that improved functioning is unlikely to occur in the absence of training and educational opportunities (which Smith received in prison from Labrecque and Schad), and stated that there was no indication Smith received any such opportunities prior to the time of the crime. This strongly reinforces, and renders highly probable, Smith’s assertion that the improvement in his functioning did not occur until after the crime was committed. For all of these reasons, we hold that the state court’s determination that no evidence explains whether the finding of Smith’s low childhood IQ could be extrapolated to the time of the crime and trial lacks even fair support in the record.

Dr. Martinez also relied on reports summarizing three Rule 11 competency evaluations Smith underwent in 1981, each of which found Smith competent to stand trial. He specifically cited the conclusion of *1189 one evaluator, Dr. LaWall, that Smith “probably functions in the average range of intelligence.” Dr. Thompson described the Rule 11 reports as an unreliable assessment of intelligence because they are “very superficial,” and “very subjective.” He explained that because the reports focus on competency, they comprise estimates of a subject’s functioning based only on a brief interview, involve little review of the subject’s history, and — more important — include no quantitative assessment of his IQ. Dr. Thompson’s critique is consistent with Arizona law and highly persuasive, and Dr. LaWall’s assessment of Smith’s intelligence carries little weight. See Ariz.Rev.Stat. Ann. § 13-703.02(K)(5) (determining whether an individual suffers from significantly subaverage general intellectual functioning requires a quantitative assessment of IQ). The State adduces no other evidence of improved academic performance or other indicia of increased intellectual functioning prior to the commission of the crime. Accordingly, viewing the record as a whole, we hold that Dr. Martinez’s conclusion is not fairly supported by the record. Because the remaining evidence supporting the state court’s conclusion is minimal, we hold that its conclusion that Smith failed to satisfy the intellectual functioning prong of Arizona’s intellectual disability definition at the time of the crime and/or trial is not fairly supported by the record. We hold instead that the evidence overwhelmingly demonstrates that Smith experienced significantly subaverage general intellectual functioning at that dispositive time.17

Moreover, to hold otherwise would contravene the fundamental principles the Supreme Court recently laid out for the benefit of the federal courts and the state judiciary in the landmark case of Hall v. Florida, — U.S. —, 134 S.Ct. 1986, 188 L.Ed.2d 1007 (2014). In that case, the Court made it clear that a determination of *1190 intellectual disability requires, at least in questionable cases, the consideration of significant relevant evidence, not simply a measurement of an IQ test at a particular point in time. Here, the premise on which the state court’s decision as to intellectual disability is based is that the IQ tests taken at a critical time — the time prior to Smith’s 18th year — must be discounted in large part because, at the time they were taken, the procedures used for such tests were not adequately recorded and the information regarding the administration of such tests was no longer available. This test evidence was not only critical to the initial IQ determination when Smith was 15 but also to the ultimate determination of his intellectual disability at the time of the crimes. To discount the reliability of the tests on such grounds is little different from failing to consider, or excluding, crucial evidence that is not only highly relevant to the principal questions at issue but is indeed critical to arriving at a fair and just answer to the question whether Smith is eligible for capital punishment.

Hall reminds us that “the death penalty is the gravest sentence our society may impose,” 134 S.Ct. at 2001, and that imposing this “harshest of punishments on an intellectually disabled person violates his or her inherent dignity as a human being,” id. at 1992. Given these stakes, Hall warns that we must not make judgments in haste as to whether a person has an intellectual disability, but rather must consider all the “substantial and weighty evidence” in cases that present close questions. Id. at 1994. Put differently, we cannot risk making the protections of Atkins a nullity by executing a person with an intellectual disability without giving him the “fair opportunity to show the Constitution prohibits [his] execution.” Id. at 2001. The state court’s decision in Smith’s case takes that risk. By discounting Smith’s early IQ tests even though they were the type used at the time and even though they are the most likely evidence that an intellectually disabled defendant of Smith’s age could present in order to prove his condition, the state court judge rendered the protection for the intellectually disabled established by Atkins effectively meaningless, which is precisely what the Court sought to avoid in Hall. Such a decision, by removing highly probative evidence of a person’s intellectual disability in a death-penalty case, not only violates the individual defendant’s right but also “contravenes our Nation’s commitment to dignity and its duty to teach human decency as the mark of a civilized world.” Id.

c. Significantly Impaired Adaptive Behavior

“ ‘Adaptive behavior’ means the effectiveness or degree to which the defendant meets the standards of personal independence and social responsibility expected of the defendant’s age and cultural group.” Ariz.Rev.Stat. Ann. § 13-703.02(K)(1). Courts applying this prong must conduct “an overall assessment of the defendant’s ability to meet society’s expectations of him.”18 State v. Grell *1191 (“Grell II”), 212 Ariz. 516, 135 P.3d 696, 709 (2006) (en banc); accord Boyston, 298 P.3d at 895.

Although there is scant case law applying this prong, we find the Arizona Supreme Court’s decision in Grell III, 291 P.3d 350 (Ariz.2013), highly instructive. In Grell III, the State stipulated that the capital defendant demonstrated significantly subaverage general intellectual functioning but contested the impairment of his adaptive behavior. Id. at 352. Independently reviewing the evidentiary record, the court proceeded to hold that Grell had also demonstrated significant deficits in adaptive behavior, and reduced his death sentence to natural life. Id. at 351, 357.19 As evidence of significantly impaired adaptive behavior, the court considered, inter alia, Grell’s grade school records showing that he had been placed in special education classes; lay witness testimony describing him as highly impulsive, unable to understand social cues of children his own age, and largely unable to use the few social skills that he had; expert testimony describing Grell’s tendency to act more like children several years younger, and noting his impulsiveness and poor communication skills; and testimony from teachers and administrators who observed that Grell was impulsive, inattentive, and unable to communicate effectively. Id. at 353-55. Grell also adduced testimony from members of the special education team at his elementary school stating their conclusion that Grell was intellectually disabled. Id. at 353. A psychologist opined that the consistency of Grell’s poor social functioning and behavioral problems demonstrated the presence of intellectual disability, because problems arising solely from antisocial or personality disorders would vary over time. Id. at 354. Grell also presented the expert testimony of an educational psychologist who concluded that

[g]iven the facts of [Grell’s] low intellectual functioning, his inability to learn from his mistakes, his reduced capacity in communication, socialization and self-help skills, and his significant history of special education, followed by failure and dropping out of school and, in the absence of significant parental support and guidance, his subsequent serious entanglement with the criminal justice system, it is clear at this point that Shawn Grell is a person who has mental retardation.

Id. at 355. The court additionally noted Grell’s history of running away from home, committing crimes, his inability to hold jobs, and his general immaturity. Id. at 356. Reviewing this evidence, the court concluded that Grell had demonstrated significant deficits in adaptive behavior, notwithstanding evidence of his limited ability to adapt. Id. at 357.

The record in this case paints a remarkably similar picture of Smith, demonstrating consistent traits, beginning in childhood and continuing through the time of *1192 the crime, that the Grell III court held established impaired adaptive behavior. See id. at 354 (consistency of behavioral problems indicates a root cause of intellectual disability, rather than antisocial or personality disorders). Like Grell, Smith had a “significant history of special education, followed by failure and dropping out of school.” See id. at 355. Specifically, Smith was held back in all his grades, placed in special education classes, subsequently transferred to a special school for children unable to learn, and dropped out after the eighth grade, by which time he was already sixteen years old. These facts are consistent with other testimony providing further evidence of Smith’s poor intellectual functioning during his childhood. Box testified that as a child, Smith had trouble learning and struggled to grasp the rules even of simple children’s games like tag and marbles. Betty Ruth Knight, another former neighbor and the mother of Smith’s fifth wife, Beth Lewis, stated that as a child Smith “always looked like he was just lost.”

Smith had poor social skills. According to Delores Elaine Long, one of Smith’s childhood neighbors, as a child Smith was unable to interact with, play with, or carry on a conversation with other children. See Grell III, 291 P.3d at 353 (“[Grell] could not understand social cues that children his age should understand, and was largely unable to use the few social skills that he had.”). Hight testified that as a young adult Smith lacked any social life, and Gerald Lambright, the cousin of Smith’s co-defendant Joe Lambright, described Smith as a “loner.” Other evidence reveals Smith’s impulsiveness. Smith’s mother reported that psychiatrists who treated him during his childhood concluded Smith had problems with impulsiveness, which would likely continue throughout his life. The presentence report also describes Smith’s impulsiveness, stating that he “responds to external, social stimuli on a very concrete level, living basically from day to day and acting on impulse to a great degree.” See id. at 353-55 (noting Grell’s impulsiveness); Atkins, 536 U.S. at 318, 122 S.Ct. 2242 (stating that intellectually disabled people “often act on impulse rather than pursuant to a premeditated plan”).

Smith’s communication skills were similarly stunted. Hight testified that, as an adult, Smith had difficulty forming sentences and correctly pronouncing words; for example, he would say “weekie days” when referring to “weekdays.” Gau and Hight each described receiving nearly incomprehensible letters from Smith during the early period of his incarceration. See Grell III, 291 P.3d at 354-55 (stating that Grell was “unable to communicate effectively” and noting “his reduced capacity in communication”).

As in Grell III, a lay witness familiar with intellectual disability concluded that Smith was intellectually disabled. Here, Hight stated that she believed Smith to be intellectually disabled, based on her comparison of Smith to her own intellectually disabled sister. See id. at 353 (Grell identified as intellectually disabled by special education staff experienced with other disabled children). Although she was not an expert, Hight’s testimony, based on her personal experience, is highly probative of Smith’s adaptive behavior. See Arellano, 143 P.3d at 1020 (discussing the relevance of lay witness testimony regarding adaptive behavior).

Other evidence indicates that Smith did not possess the skills necessary to take care of his own needs. Hight described Smith as lacking basic hygiene, unable to sit at the table or eat properly, and unable to take care of himself without assistance. Labrecque testified that he was forced to reprimand Smith on one occasion over his sloppiness, body odor, and *1193 infrequent bathing. See Grell III, 291 P.3d at 355 (“Dr. Keyes’ investigation revealed that Grell’s family viewed Grell as ‘somewhat incapable of caring for many of his own needs....’ He concluded that Grell’s record confirmed his adaptive deficits, as illustrated by his lifelong inability ‘to conform his behavior to the expected standards of his social and same aged peers.’ ”).

The evidence in this case includes many additional parallels to the evidence presented in Grell. Smith tormented his younger half-sister, sexually abusing her at a young age: when he was twelve, Smith was severely punished after persuading Gau to play “doctor” when she was five years old, and when Gau was nine Smith, then sixteen, brought her out to the garage and forced her to perform oral sex on him. Smith made repeated attempts to run away from home after which he was jailed for vagrancy. He frequently got into trouble for criminal activity, including numerous arrests. Hight testified that Smith was unable to hold a job, which she attributed to his inability to “comprehend what a normal person ... would be able to interpret.” The record reveals that Smith cycled through more than 100 short-term jobs over a period of sixteen years, which Dr. Thompson described as evidence of multiple adaptive impairments.20 Smith also functioned very immaturely as an adult. Long testified that Smith got along well with the children of Beth Lewis, Smith’s fifth wife, because he related to them as a child rather than as an adult: “[H]e was a lot mentality [sic] like they were. I mean, like instead of being a dad figure, he was kind of like they were.” Gerald Lambright testified that, as an adult, Smith was immature, had difficulty interacting with adults his own age, would frequently mimic Donald Duck when he spoke at all, and preferred to interact with children.21 Labrecque characterized Smith’s emotional maturity as resembling that of a twelve to fourteen year-old even a number of years after he had committed the crimes. Multiple medical records appended to Smith’s presentence report describe Smith in his late teens and early twenties as possessing an “immature personality” and exhibiting “immature behavior.” As the Grell III court found, this behavior indicates significant impairment in adaptive behavior under Arizona law. See id. at 356-57 (stating that tormenting other children, running away from home, committing crimes, an inability to hold jobs, and immaturity are among the elements of a mental history that “by itself, provides strong evidence that [an individual] suffered a ‘significant impairment’ in the ability to ‘meet[] the standards of personal independence and social responsibility expected’ of him.” (quoting Ariz.Rev.Stat. Ann. § 13-753(K)(1), (3))).

Additional evidence of Smith’s impaired adaptive behavior not present in Grell makes Smith’s impaired adaptive behavior even clearer. Charles McCarver, who lived in Smith’s apartment complex and worked with Smith repossessing cars, gave testimony describing an incident in which McCarver’s ex-girlfriend Penny jokingly told Smith that he could “have” their son because Smith and his wife were having difficulty conceiving their own child. Following this conversation, Smith called McCarver to say that Penny had told him he could have McCarver’s son. McCarver adamantly refused. Undeterred, Smith *1194 showed up at McCarver’s home expecting to take the boy, changing his mind only after seeing how happy the child was with McCarver. Smith’s absurdly literal interpretation of McCarver’s joke that he would “give” Smith his son vividly demonstrates Smith’s malformed social and communication skills and his general inability to navigate his social world.22

Smith’s mother, Sylvia Scott (Joe Lambright’s wife), Gerald Lambright, and the presentence report all described Smith as a follower, a trait the Supreme Court has identified as an indicator of impaired adaptive behavior. Gerald said that Smith would do whatever Joe told him to do, adding that “it was almost like the guy could not think for himself.” See Atkins, 536 U.S. at 318, 122 S.Ct. 2242 (“in group settings [intellectually disabled people] are followers rather than leaders”). The presentence report describes Smith as having a “borderline personality,” which is also probative of Smith’s condition at the time of Smith’s crime and trial.23

Moreover, Smith demonstrated a lifelong inability to make informed decisions regarding his own safety and welfare. Specifically, Smith was described as having poor judgment as a child and engaging in dangerous behavior without awareness of its risks. As an adult, Smith accepted dares to run across the highway in front of an oncoming truck and climb to the top of a radar tower hundreds of feet tall, where he dangled himself by his arms. He would sometimes go up on the top of buildings where carpentry work was being performed and jump along the beams and rafters without any safety harness. On one occasion while in prison, Smith took a walk along the edge of the roof of a two-story building, earning a rebuke from Labrecque. Such reckless behavior, apparently undertaken without any comprehension of the risks involved, *1195 further demonstrates Smith’s inability to meet the expected standards of personal independence.

Although Dr. Martinez viewed Smith’s ability to date women as evidence of his adaptive abilities, that testimony is clearly of little worth. The only evidence of Smith’s romantic life is his five failed marriages, the details of which paint a picture inconsistent with Dr. Martinez’s assessment. Smith’s first three marriages lasted a cumulative total of nineteen months. The presentence report notes that Smith beat his fourth wife, threatened her life, enjoyed tying her up and pretending to rape her, and on other occasions forced her to submit to anal intercourse against her will. Smith married Beth Lewis, his fifth wife, in November 1980, shortly before his arrest. According to Lewis, at one point she decided to end their relationship and Smith became very angry; he grabbed a gun and, shaking it in front of her, said “You want to end it? I can end it for us.” Afraid for her life, Lewis said that she would “do anything.” After contemplating this offer, Smith decided the pair should get married. That same evening, Smith pushed Lewis into the backseat of her car and tore off her pantyhose. Lewis said she began screaming and crying and begged Smith to stop, which he did, leading her to conclude that “[s]o he didn’t actually I guess rape me.” Following this encounter, Smith drove by Lewis’s home on several occasions and waved a pistol at her; the couple married a short time later. We fail to see how Smith’s serial marriages, at least some of which involved death threats as well as incidents of simulated and actual sexual assault, exhibit the “standards of personal independence and social responsibility expected of the defendant’s age and cultural group.” Rather, they further demonstrate the adaptive impairments affecting this and so many other areas of Smith’s childhood and adult life.

Testimony by Dr. Thompson and Dr. Martinez indicating that Smith possessed some adaptive skills does not alter the conclusion that it is highly probable that Smith experienced significant impairment in adaptive behavior at the relevant times. The evidence that Smith exhibited limited adaptive abilities is substantially outweighed by evidence of more far-reaching adaptive impairments. We note, moreover, that Arizona law does not mandate a complete absence of adaptive strengths. See Grell III, 291 P.3d at 357 (“The record also contains some indications of Grell’s limited ability to adapt. Although this evidence makes our decision difficult, a diagnosis of mental retardation, as statutorily defined, does not require a complete absence of adaptive skills.”).

Nor do we regard the Rule 11 reports as inconsistent with our conclusion. As the Supreme Court and our own Court have held, the ultimate conclusions stated in these reports — that Smith understood the difference between right and wrong, and was competent to stand trial — are not inconsistent with intellectual disability. See Atkins, 536 U.S. at 318, 122 S.Ct. 2242 (“Mentally retarded persons frequently know the difference between right and wrong and are competent to stand trial.”); Rohan ex rel. Gates v. Woodford, 334 F.3d 803, 810 n. 3 (9th Cir.2003) (“Incompetence and mental retardation are overlapping but distinct categories. Many retarded individuals are still competent to stand trial.”), abrogated on other grounds by Ryan v. Gonzales, — U.S. —, 133 S.Ct. 696, 184 L.Ed.2d 528 (2013). Nor, for that matter, is Dr. LaWall’s finding that Smith had a personality disorder with antisocial features inconsistent with our conclusion regarding impaired adaptive behavior, especially in light of Smith’s immaturity and childlike conduct. See Grell III, 291 P.3d at 354, 356 (citing expert testimony that “[i]f Grell had a mere conduct or *1196 personality disorder ... he would have committed acts that were simply against the rules and deviant ..., rather than acting, as he did, in ways that were embarrassing or immature,” and noting that antisocial personality disorder is not inconsistent with intellectual disability); Brumfield v. Cain, — U.S. —, 135 S.Ct. 2269, 2280, 192 L.Ed.2d 356 (2015) (“[A]n antisocial personality is not inconsistent ... with intellectual disability.”).

The vast majority of the evidence strongly points to the conclusion that Smith was unable to “meet[] the standards of personal independence and social responsibility expected of [his] age and cultural group,” both before the age of eighteen and at the time of the crime. Ariz.Rev.Stat. Ann. § 13-703.02(K)(1). Accordingly, we conclude that the state court’s determination that Smith’s pre-arrest life did not show significant impairment in adaptive behavior is not fairly supported by the record.

In sum, we conclude that under § 2254(d)(8) the clear weight of the evidence overcomes the presumption of correctness attaching to the state court’s finding that Smith was not intellectually disabled, as well as the state court’s ancillary factual determinations necessary to its ultimate conclusions. Specifically, we have found that the grounds on which the state court discounted Dr. Thompson’s testimony lack fair support in the record and are the product of legal error.

2. The State Court Applied an Unconstitutional Standard of Proof

The state court’s factual determination is not entitled to deference for a separate and independent reason. The Pima County Superior Court found Smith was not intellectually disabled by applying an incorrect and unconstitutional legal standard, a question of law we review de novo.

As the Tenth Circuit has recognized in pre-AEDPA cases, a state court’s factual determination rendered under a constitutionally impermissible legal standard is not entitled to a presumption of correctness. See Lafferty v. Cook, 949 F.2d 1546, 1551 n. 4 (10th Cir.1991) (“The initial inquiry must be whether the Utah court made its fact findings under the correct legal standard of competency. It is elemental that fact finding made under an erroneous view of the governing law cannot be presumed correct. Only after concluding that a state court used the proper standard does a habeas court turn to the issue of the presumption of correctness.”); accord Walker v. Att’y Gen. for Oklahoma, 167 F.3d 1339, 1345 (10th Cir.1999) (“Mr. Walker’s competency was determined under a constitutionally impermissible standard of proof. Such a determination is not entitled to a presumption of correctness.”).24

In the section of the Pima County Superior Court’s decision entitled “Burden of Proof,” the court described the legal standard governing Smith’s Atkins claim.25 *1197 The court subsequently analyzed the evidence, after which it set forth its ultimate finding in the final section of its opinion. It concluded that “the circumstances described at the hearing do not point to mental retardation with any degree of certainty.” (Emphasis added.) A court’s recitation of the proper governing legal standard does not insulate its holding from habeas review where the record demonstrates that the court actually applied an unconstitutional standard. See Sears v. Upton, 561 U.S. 945, 952, 130 S.Ct. 3259, 177 L.Ed.2d 1025 (2010) (per curiam) (“Although the court appears to have stated the proper prejudice standard, it did not correctly conceptualize how that standard applies to the circumstances of this case.” (footnote omitted)). Here, because the state court made no other mention of the correct legal standard, and because its analysis provides no indication that the court actually applied the correct legal standard rather than the standard employed when it applied the law to the facts, its boilerplate statements in the introductory “Burden of Proof” section are of no force or effect.

Under Arizona law, the “any degree of certainty” standard applied by the Pima County Superior Court is more akin to the “reasonable doubt” standard than the clear and convincing standard mandated by Arizona’s Atkins statute, which requires only that the issue under consideration be “highly probable.” See State v. King, 158 Ariz. 419, 763 P.2d 239, 243, 246 (1988) (reversing trial court for providing erroneous jury instructions, and explaining that “[t]he instruction now before us utilized the term ‘certainty’ in defining the clear and convincing standard.... We believe that ‘certainty’ is truer to the concept of proof beyond a reasonable doubt than to the ‘highly probable’ meaning of the clear and convincing standard.”).

To be sure, “[a] state’s misapplication of its own laws does not provide a basis for granting a federal writ of habeas corpus.” Roberts v. Hartley, 640 F.3d 1042, 1046 (9th Cir.2011). A state’s Atkins procedures present a special case, however. “ ‘[B]ecause Atkins reserved for the states the task of developing appropriate ways to enforce the constitutional restriction’ prohibiting the execution of the intellectually disabled, ‘federal courts conducting habeas review routinely look to state law ... in order to determine how Atkins applies to the specific case at hand.’ ” Williams v. Mitchell, 792 F.3d 606, 612 (6th Cir.2015) (quoting Black v. Bell, 664 F.3d 81, 92 (6th Cir.2011)). Stated differently, Atkins leaves to the states the task of developing appropriate procedures to enforce the constitutional right, but constitutionalizes the procedures the state creates. Consequently, where a state court analyzing an Atkins claim fails to follow binding state law, its decision does not simply violate state law, but also violates the Eighth Amendment right provided by Atkins and the violation is therefore cognizable by a federal habeas court. Id. (“[W]here a state-court decision is ‘contrary to’ clearly established state supreme court precedent applying Atkins, the decision is ‘contrary to Atkins’ for purposes of habeas review” under AEDPA); see also Black, 664 F.3d at 97 (“[B]ecause Atkins defers to the individual states to set out the standard for a defendant to qualify as mentally retarded, the [state court’s] misinterpretation of [the state supreme court’s decision] is contrary to Atkins”).

Here, the “certainty” standard applied by the state trial court was plainly *1198 contrary to the clear and convincing standard required by Arizona’s statute and adopted by its supreme court. See Ariz.Rev.Stat. Ann. § 13-703.02(G); Grell II, 135 P.3d at 701 (“The statute places on ‘the defendant ... the burden of proving mental retardation by clear and convincing evidence’ in the pretrial hearing.” (alteration in original) (quoting § 13-703.02(G))). Accordingly, the standard of proof applied by the state trial court was not simply contrary to state law but was also unconstitutional under Atkins, see Williams, 792 F.3d at 612; Black, 664 F.3d at 97, and, accordingly, the state court’s findings are not due any deference. See Lafferty, 949 F.2d at 1551 n. 4; Walker, 167 F.3d at 1345.

There is another reason the standard of proof applied by the state trial court is unconstitutional, and would be even if it were consistent with state law: a “certainty” standard of proof transgresses the limits of the state’s authority to craft appropriate procedures to enforce Atkins and, in so doing, encroaches on the substantive constitutional right. In reaching this conclusion, it is not necessary to determine what standard of proof the federal Constitution requires, but rather only whether the Arizona court applied a standard it forbids. Cf. Schriro v. Smith, 546 U.S. at 7-8, 126 S.Ct. 7 (state Atkins procedures may, “in their application, be subject to constitutional challenge,” but the state must first have an opportunity to apply them).

In Atkins, the Supreme Court did not announce a specific standard of proof governing claims of intellectual disability. Instead, the Court, citing Ford v. Wainwright, stated that it was “leav[ing] to the States the task of developing appropriate ways to enforce the constitutional restriction upon [their] execution of sentences.” 536 U.S. at 317, 122 S.Ct. 2242 (quoting Ford, 477 U.S. at 405, 416-17, 106 S.Ct. 2595). This did not leave the states unchecked discretion in determining such procedures, however. Rather, to be constitutional, a state’s procedures must constitute “appropriate ways to enforce the constitutional restriction.” Id. (emphases added) (quoting Ford, 477 U.S. at 416, 106 S.Ct. 2595). The Court’s citation to Ford reinforces this view. In Ford, a majority of the Court found Florida’s specific procedures for determining the sanity of a condemned prisoner constitutionally inadequate. See Ford, 477 U.S. at 413, 106 S.Ct. 2595; see also id. at 418, 106 S.Ct. 2595 (plurality opinion); id. at 424-25, 106 S.Ct. 2595 (Powell, J., concurring in part and concurring in the judgment); id. at 427, 106 S.Ct. 2595 (O’Connor, J., concurring in the result in part and dissenting in part).

When the natural operation of a state’s procedures for rendering factual determinations transgresses a substantive constitutional right, those procedures are unconstitutional. See Bailey v. Alabama, 219 U.S. 219, 239-44, 31 S.Ct. 145, 55 L.Ed. 191 (1911). It is elementary that the “natural operation” of applying a heightened standard of proof can determine the outcome of litigation, and thus the availability of a constitutional right. See id. at 244, 31 S.Ct. 145 (stating that “we must consider the natural operation of the statute here in question”). As the Supreme Court has recognized, it is often impossible to ascertain disputed facts with absolute certainty. Victor v. Nebraska, 511 U.S. 1, 14, 114 S.Ct. 1239, 127 L.Ed.2d 583 (1994). Consequently, “the trier of fact will sometimes, despite his best efforts, be wrong in his factual conclusions.” In re Winship, 397 U.S. 358, 370, 90 S.Ct. 1068, 25 L.Ed.2d 368 (1970) (Harlan, J., concurring). “The function of a standard of proof ... is to ‘instruct the factfinder concerning the degree of confidence our society thinks he should have in the correctness of factual conclusions for a particular type of adjudication.’ ” Addington v. Texas, 441 U.S. 418, *1199 423, 99 S.Ct. 1804, 60 L.Ed.2d 323 (1979) (quoting In re Winship, 397 U.S. at 370, 90 S.Ct. 1068 (Harlan, J., concurring)). As a result, “[t]he standard [of proof] serves to allocate the risk of error between the litigants.” Addington, 441 U.S. at 423, 99 S.Ct. 1804; see Cooper v. Oklahoma, 517 U.S. 348, 362, 116 S.Ct. 1373, 134 L.Ed.2d 498 (1996) (“The ‘more stringent the burden of proof a party must bear, the more that party bears the risk of an erroneous decision.’ ” (quoting Cruzan v. Director, Mo. Dept. of Health, 497 U.S. 261, 283, 110 S.Ct. 2841, 111 L.Ed.2d 224 (1990))).

Atkins claims present a heightened risk of an erroneous factual conclusion. Unlike factual determinations in which the basic issue is whether a fact occurred — for example, whether a defendant actually committed the act of which he is accused— determinations like intellectual disability, which depend upon psychiatric diagnosis, turn on an expert’s interpretation of the meaning of various facts. Cf. Addington, 441 U.S. at 429, 99 S.Ct. 1804. As the Supreme Court explained in rejecting the argument that the Constitution requires use of a reasonable doubt standard in the context of civil commitment proceedings, the unique nature of psychiatric diagnosis renders factual determinations uniquely unsusceptible to certainty.

The subtleties and nuances of psychiatric diagnosis render certainties virtually beyond reach in most situations. The reasonable-doubt standard of criminal law functions in its realm because there the standard is addressed to specific, knowable facts. Psychiatric diagnosis, in contrast, is to a large extent based on medical “impressions” drawn from subjective analysis and filtered through the experience of the diagnostician. This process often makes it very difficult for the expert physician to offer definite conclusions about any particular patient. Within the medical discipline, the traditional standard for “factfinding” is a “reasonable medical certainty.” If a trained psychiatrist has difficulty with the categorical “beyond a reasonable doubt” standard, the untrained lay juror — or indeed even a trained judge— who is required to rely upon expert opinion could be forced by the criminal law standard of proof to reject commitment for many patients desperately in need of institutionalized psychiatric care.
We have concluded that the reasonable-doubt standard is inappropriate in civil commitment proceedings because, given the uncertainties of psychiatric diagnosis, it may impose a burden the state cannot meet and thereby erect an unreasonable barrier to needed medical treatment.

Id. at 430-32, 99 S.Ct. 1804 (citations omitted). Similar concerns also arise in other contexts requiring psychiatric diagnosis. See Ford, 477 U.S. at 426, 106 S.Ct. 2595 (Powell, J., concurring in part and concurring in the judgment) (sanity); Cooper, 517 U.S. at 365, 369, 116 S.Ct. 1373 (competency).

The concern espoused in Addington regarding the inherent imprecision of psychiatric determinations of mental illness for the purpose of civil commitment applies with even greater force to psychiatric determinations of intellectual disability under Atkins.26 Unlike civil commitment *1200 proceedings, which inquire into whether an individual is presently mentally ill and poses a danger to himself or others, the age of onset element of Atkins claims requires a retrospective analysis of the individual’s childhood capacity that may be years or, as in this case, even decades removed from the time of trial. Moreover, in cases like this, in which the trial predates Atkins and Petitioner’s claim arises for the first time on habeas, the determination of mental condition at the time of commission of the crime may occur not at trial but rather decades afterwards. Smith’s case illustrates the difficulties that inhere in such an inquiry: as discussed below, records detailing the administration of childhood IQ tests are unavailable, and lay witnesses untrained in psychology are asked to share distant recollections of Petitioner’s behavior as a child and young adult. Certainty is thus even less attainable and a certainty standard is even less constitutionally acceptable in such cases.

Further compounding the likelihood of error in Atkins claims is the fact that the overwhelming majority (85 percent) of individuals with intellectual disability fall into the “mild” category, for whom the likelihood of misdiagnosis is particularly acute. As young children such individuals are often indistinguishable from children without intellectual disability, and as adults they can acquire social and vocational skills adequate for minimum self-support. DSM-IV 43; see also AAIDD 11th ed., at 47 (“Individuals with [intellectual disability] typically demonstrate both strengths and limitations in adaptive behavior.”). In fact, Daryl Atkins himself maintained that he was only “mildly mentally retarded.” Atkins, 536 U.S. at 308, 122 S.Ct. 2242. However, Atkins applies equally to all intellectually disabled individuals irrespective of the degree of their disability.

Not only are Atkins claims uniquely susceptible to erroneous factual determinations, but they occur in a context — capital punishment — requiring a heightened degree of certainty that the decision is not erroneous. “Because the standard of proof affects the comparative frequency of ... erroneous outcomes, the choice of the standard to be applied in a particular kind of litigation should, in a rational world, reflect an assessment of the comparative social disutility of each.” In re Winship, 397 U.S. at 371, 90 S.Ct. 1068 (Harlan, J., concurring). The Supreme Court’s repeated holdings that capital cases require a heightened degree of certainty that the punishment is lawful make clear its determination that the social “disutility” of a wrongful execution outweighs the “disutility” of errors favoring defendants. See Gilmore v. Taylor, 508 U.S. 333, 342, 113 S.Ct. 2112, 124 L.Ed.2d 306 (1993) (“[T]he Eighth Amendment requires a greater degree of accuracy and factfinding than would be true in a noncapital case.”); Ford, 477 U.S. at 411, 106 S.Ct. 2595 (plurality opinion); Lockett v. Ohio, 438 U.S. 586, 604-05, 98 S.Ct. 2954, 57 L.Ed.2d 973 (1978) (plurality opinion) (“[The] qualitative difference between death and other penalties calls for a greater degree of reliability when the death sentence is imposed .... When the choice is between life and death, [a heightened risk of wrongful execution created by a state statute] is unacceptable and incompatible with the commands of the Eighth and Fourteenth Amendments.”); Woodson v. North Carolina, 428 U.S. 280, 305, 96 S.Ct. 2978, 49 L.Ed.2d 944 (1976) (plurality opinion). Accordingly, where, as in Atkins, the Eighth Amendment renders a class of individuals *1201 categorically ineligible for execution, the procedures used to determine whether a defendant falls into that class may not allocate nearly all of the risk of an erroneous determination to the defendant.

By requiring Smith to demonstrate with a “degree of certainty” that he is intellectually disabled, the Arizona court disregarded this fundamental rule. Simply stated, the court took the highly unusual step27 of allocating nearly the entire risk of an erroneous determination to Smith. That the factual determination in question concerned an issue for which certainty may be unattainable, cf. Addington, 441 U.S. at 429-32, 99 S.Ct. 1804, and a penalty for which a greater degree of reliability is required, see, e.g., Gilmore, 508 U.S. at 342, 113 S.Ct. 2112; Lockett, 438 U.S. at 604-05, 98 S.Ct. 2954 (1978) (plurality opinion), renders the constitutional violation even more clear. Like the Alabama statute in Bailey, the standard of proof applied by the Pima County Superior Court in this case transgresses a substantive constitutional right by accomplishing indirectly what the state may not do directly: the execution of individuals who are intellectually disabled under Atkins. See Bailey, 219 U.S. at 239, 31 S.Ct. 145 (“It is apparent that a constitutional prohibition cannot be transgressed indirectly by the creation of a [procedural rule] any more than it can be violated by direct enactment.”); Atkins, 536 U.S. at 321, 122 S.Ct. 2242. Because it impairs the substantive right, the state court’s “certainty” standard of proof is not an “appropriate way[ ] to enforce the constitutional [protection]” mandated by Atkins. Atkins, 536 U.S. at 317, 122 S.Ct. 2242. In short, the Constitution forbids requiring a defendant to demonstrate intellectual disability with “any degree of certainty.”28 Because the Pima County Superior Court made its finding that Smith is not intellectually disabled by applying an incorrect and unduly onerous legal standard, its ultimate factual determination is not consonant with the Eighth Amendment. A finding that is made pursuant to the wrong legal standard is not a finding at all. Accordingly, the state court’s application of an unconstitutional standard of proof provides an independent and alternative ground for denying its determination a presumption of correctness.

D. Whether Smith is Intellectually Disabled

Having determined that the state court’s determination is not entitled to a presumption of correctness, we must review the record de novo to determine whether Smith has demonstrated intellectual disability by clear and convincing evidence, as required by Arizona law. For all the reasons set forth in Section II.C.1, we hold that he has. Considering Smith’s intellectual functioning test scores and his history of significantly impaired adaptive behavior, as we must under Atkins and Hall, we find that the record in this case overwhelmingly demonstrates that Smith satisfied the two substantive prongs of *1202 Arizona’s definition of intellectual disability both prior to age eighteen and at the time of the crime. Specifically, Smith’s Otis test score of 62, combined with his poor academic performance, clearly demonstrates the childhood onset of his significantly subaverage general intellectual functioning. The record further demonstrates that, consistent with Dr. Thompson’s testimony, Smith also experienced this condition at the time of the crime: improvement in Smith’s intellectual functioning did not occur until after his incarceration in a structured environment, when he began receiving appropriate antidepressant medication as well as tutoring from Labrecque and Schad. The many parallels between Smith’s life and that of the capital defendant in Grell, including Smith’s stunted communication skills, lack of personal care skills, severe immaturity, and inability to maintain employment and personal relationships, reveal his significant impairment in adaptive behavior as a child and at the time of the crime, as does his general lifelong inability to navigate his social world.

There can be no doubt that the crime in this case was truly horrific. The Constitution, however, regards intellectually disabled defendants as less morally culpable for their crimes, and for this reason prohibits their execution. Atkins, 536 U.S. at 316, 122 S.Ct. 2242; Hall, 134 S.Ct. at 1992. Viewing the record as a whole, we find that Smith has demonstrated by clear and convincing evidence significantly subaverage general intellectual functioning existing concurrently with significant impairment in adaptive behavior, and that both conditions were manifested prior to age eighteen and at the time Smith committed the capital offense. The overwhelming weight of the evidence compels this result. Smith is intellectually disabled and may not be executed. Atkins, 536 U.S. at 316, 122 S.Ct. 2242; Hall, 134 S.Ct. at 1992. Accordingly, we reverse Smith’s death sentence and remand to the district court with instructions to grant the writ as to his capital sentence.

CONCLUSION

The judgment of the district court is reversed. We remand with instructions to grant the writ with respect to the penalty phase and return the case to the state court to reduce Smith’s sentence to life or natural life.

REVERSED AND REMANDED.

. Judge Reinhardt’s opinion is the opinion of the court except for Section II.C.2. in which neither Judge Schroeder nor Judge Callahan joins.

. Because we grant relief on the Atkins claim, we find it unnecessary to reach Smith’s claim of ineffective assistance of counsel.

. Because the lengthy factual and procedural history of this case is known to the parties and set forth in prior opinions, we recount only those portions directly relevant to the issues discussed herein.

. Although both the parties and prior opinions in this case use the term "mental retardation,” we employ the term "intellectually disabled.” See Hall v. Florida, - U.S. -, 134 S.Ct. 1986, 1990, 188 L.Ed.2d 1007 (2014). We use "mental retardation” only when quoting material employing that term.

. The Fourth and Fifth Circuits have held that the question of whether a person is intellectually disabled under Atkins constitutes an issue of fact. See Walker v. Kelly, 593 F.3d 319, 323 (4th Cir.2010); Maldonado v. Thaler, 625 F.3d 229, 236 (5th Cir.2010). The Nevada, Pennsylvania, and Tennessee Supreme Courts have held that the question is instead a mixed question of law and fact. Ybarra v. State, 247 P.3d 269, 276 (Nev.2011); Commonwealth v. Crawley, 592 Pa. 222, 924 A.2d 612, 615 (2007); State v. Strode, 232 S.W.3d 1, 8 (Tenn.2007). We have not yet decided the issue in our Circuit, but have held in a separate context that the question of intellectual disability is a mixed question of law and fact. See Gregory K. v. Longview Sch. Dist., 811 F.2d 1307, 1310 (9th Cir.1987) (whether student was intellectually disabled, as defined by state regulations, for the purpose of the federal Education for All Handicapped Children Act is a mixed question of law and fact).

. Section 13-703.02 was subsequently renumbered as § 13-753. 2008 Ariz. Sess. Laws, Ch. 301, § 26. In 2011, the statute was amended to substitute the term "intellectual disability” for "mental retardation.” Ariz. Sess. Laws 2011, Ch. 89, § 5. Unless otherwise stated, all references to § 13-703.02 are to the version in effect at the time of Smith’s Atkins evidentiary hearing.

. The mere fact that the record contains contrary opinions by two expert witnesses does not render it ambiguous. Once we look behind each expert’s conclusion and consider the evidence on which he relies, it becomes clear that the great majority of the evidence strongly reinforces Dr. Thompson's opinion and that Dr. Martinez’s contrary conclusion lacks even fair evidentiary support.

. When a state court's decision to discount certain evidence constitutes a factual determination, we may apply § 2254(d)(8) to determine whether deference is due. See Carriger, 132 F.3d at 473-76, 478 (applying § 2254(d)(8) to reject the state court’s ancillary factual determination that a witness lacked credibility and holding, based in part upon our decision to credit that witness's testimony, that the petitioner had satisfied the Schlup "miscarriage of justice” standard); see also Schlup v. Delo, 513 U.S. 298, 314, 115 S.Ct. 851, 130 L.Ed.2d 808 (1995). Where a state court’s decision to discount certain evidence results from legal error, the presumption of correctness does not apply. See Sivak, 658 F.3d at 905.

. Because it applies equally to the time of the crime and trial, the constitutional right announced in Atkins is unlike the rights provided by Pate v. Robinson, 383 U.S. 375, 378, 86 S.Ct. 836, 15 L.Ed.2d 815 (1966) (right to not to be tried while legally incompetent), and Ford v. Wainwright, 477 U.S. 399, 409-10, 106 S.Ct. 2595, 91 L.Ed.2d 335 (1986) (right not to be executed while insane), which attach, respectively, to the time of trial and execution.

. Many states expressly recognize that Atkins applies to individuals who may be deemed intellectually disabled at the time the crime was committed or at trial. See, e.g., Smith v. State, No. 1060427, - So.3d -, -, 2007 WL 1519869, at *8 (Ala. May 25, 2007); Ark.Code Ann. § 5-4-618(b); Del.Code Ann. tit. 11, § 4209(d)(3)(c); Ga.Code Ann. § 17-7-131(c); Pizzuto v. State, 146 Idaho 720, 202 P.3d 642, 653, 654 (2008); Chase v. State, 171 So.3d 463, 468 (Miss.2015); S.D. Codified Laws § 23A-27A-26.1; Tenn.Code Ann. § 39-13-203(b); Ex parte Cathey, 451 S.W.3d 1, 19 (Tex.Crim.App.2014); Wash. Rev.Code § 10.95.030(2).

. The version of the statute in effect at the time of Smith’s evidentiary hearing uses an identical definition. Ariz.Rev.Stat. Ann. § 13-703.02(K)(5).

. Given the substantial time between the commission of the crime and the IQ tests administered by Drs. Thompson and Martinez, this inference is not particularly strong. Moreover, as discussed below, substantial evidence demonstrates that Smith’s IQ did in fact fall below the threshold necessary to demonstrate significantly subaverage general intellectual functioning at the time the crime was committed.

. While the standard for overcoming the statutory presumption of intellectual disability is not particularly clear, the general rule in Arizona suggests it is low. Cf. State v. Lewis, 236 Ariz. 336, 340 P.3d 415, 420 (App.2014) (”[A]s with other rebuttable presumptions, the presumption of continued incompetence ‘disappears entirely upon the introduction of any contradicting evidence and when such evidence is introduced the existence or non-existence of the presumed [incompetence] is to be determined exactly as if no presumption had ever been operative.' ” (quoting Sheehan v. Pima Cnty., 135 Ariz. 235, 660 P.2d 486, 489 (App.1982))).

. Courts have taken a range of approaches with regard to the Flynn effect. Some courts have gone beyond the Fourth and Eleventh Circuits by mandating its application to defendants’ IQ scores. See Thomas v. Allen, 614 F.Supp.2d 1257, 1281 (N.D.Ala.2009) ("A court must also consider the Flynn effect and the standard error of measurement in determining whether a petitioner's IQ score falls within a range containing scores that are less than 70.”); United States v. Parker, 65 MJ 626, 629 (NM.Ct.Crim.App.2007); People v. Superior Court,-Cal.App.4th-, 28 Cal.Rptr.3d 529, 558-559 (2005), overruled on other grounds ("In determining [a petitioner's] IQ score, consideration must be given to the so-called Flynn effect”). Other courts have left to the trial court’s discretion whether to apply the Flynn Effect. See State v. Burke, No. 04AP-1234, 2005 WL 3557641, at *13 (Ohio Ct.App. Dec. 30, 2005) ("We conclude that a trial court must consider evidence presented on the Flynn effect, but, consistent with its prerogative to determine the persuasiveness of the evidence, the trial court is not bound to, but may, conclude the Flynn effect is a factor in a defendant’s IQ score.”). Still other courts have rejected use of the Flynn Effect. See Bowling v. Commonwealth, 163 S.W.3d 361 (Ky.2005) (neither the Flynn effect nor standard margins of error properly are considered); Howell v. State, 151 S.W.3d 450, 458 (Tenn.2004); Neal v. State, 256 S.W.3d 264, 273 (Tex.Crim.App.2008) (“We have previously refrained from applying the Flynn effect, however, noting that it is an 'unexamined scientific concept’ that does not provide a reliable basis for concluding that an appellant has significant sub-average general intellectual functioning.”); In re Mathis, 483 F.3d 395, 398 n. 1 (5th Cir.2007) ("The Flynn Effect ... has not been accepted in this Circuit as scientifically valid.”).

. The state court also noted that expert testimony regarding Quantitative Electroencephalography (QEEG) testing on Smith, which it held inadmissible, played a role in Dr. Thompson’s opinion that Smith's functional limitations were related to his frontal lobe dysfunction. Because testimony regarding the QEEG testing played a non-essential and limited role in Dr. Thompson’s conclusion, his opinion cannot be discounted on this basis.

. The state court committed legal error when it discounted Dr. Thompson’s opinion that Smith's abusive upbringing contributed to his intellectual disability, which was manifested by poor test scores and grades, and instead adopted the state’s theory that Smith’s abusive upbringing itself caused his poor academic performance but that he was not intellectually disabled. The state’s theory misapprehends Arizona’s definition of intellectual *1188 disability, which centers on indicators such as low IQ scores and impaired adaptive behavior and not the purported etiology of these indicators. See Ariz.Rev.Stat. Ann. § 13-703.02(K)(1), (3), (5). Simply stated, while the specific cause of intellectual disability is significant with regard to whether the condition is static or mutable, the threshold question whether an individual is intellectually disabled is answered simply by the presence of impaired functioning regardless of its purported cause. See AAIDD 11th ed., supra, at 59-61 (describing intellectual disability as arising from cultural-familial factors, biological factors, or a combination of the two, and stating that "[b]ecause [intellectual disability] is characterized by impaired functioning, its etiology is whatever caused this impairment in functioning.”).

. The State’s citation to cases describing intellectual disability as a static condition does not alter our conclusion. Heller v. Doe, 509 U.S. 312, 323, 113 S.Ct. 2637, 125 L.Ed.2d 257 (1993); Moormann v. Schriro, 672 F.3d 644, 649 (9th Cir.2012); State v. Arellano, 213 Ariz. 474, 143 P.3d 1015, 1020 (2006). Moormann and Arellano each rely on Heller, which cites a 1985 report for the proposition that intellectual disability "is a permanent, relatively static condition.” Heller, 509 U.S. at 323, 113 S.Ct. 2637 (citing Samuel J. Brakel et al., The Mentally Disabled and the Law 37 (3d ed.1985)). Thus, all of this case law relies on a single study that substantially predates developments in the clinical understanding of intellectual disability as a fluid condition subject to change. See Am. Ass'n on Mental Retardation, Mental Retardation: Definition, Classification, and Systems of Supports 1, 5 (9th ed.1992) [hereinafter AAMR 9th ed.] ("With appropriate supports over a sustained period, the life functioning of the person with mental retardation will generally improve.”); id. at 18 ("Mental retardation begins prior to age 18 but may not be of lifelong duration.”); Am. Psychiatric Ass'n, Diagnostic and Statistical Manual of Mental Disorders 47 (4th ed.2000) [hereinafter DSM-IV] ("Mental Retardation is not necessarily a lifelong disorder. Individuals who had Mild Mental Retardation earlier in their lives manifested by failure in academic learning tasks may, with appropriate training and opportunities, develop good adaptive skills in other domains and may no longer have the level of impairment required for a diagnosis of Mental Retardation.”); AAIDD 11th ed., supra, at xiii (“ID is no longer considered entirely an absolute, invariant trait of the person.”). This contemporary clinical understanding necessarily informs the law on intellectual disability. See Hall v. Florida, - U.S. -, 134 S.Ct. 1986, 1993, 188 L.Ed.2d 1007 (2014) (stating that legal definitions of intellectual disability "are informed by the work of medical experts”). In addition, unlike the record in this case, none of the cases on which the State relies involves an evidentiary record containing extensive expert testimony describing intellectual disability as a condition that is neither fixed nor static where (as in Smith's case) it is influenced by environmental factors rather than an underlying medical condition.

. Although state courts generally construed Atkins as imposing no binding definition of impaired adaptive behavior, the Supreme Court held in Hall that states must comply with elements of the clinical definition about which there exists a national consensus. 134 S.Ct. at 1998-99. Because Arizona’s definition of adaptive behavior is far more restrictive than the clinical definition, Williams, 303 P.3d at 548 (Eckerstrom, P.J., dissenting), and because a national consensus exists with regard to this aspect of the clinical definition, see Concurring Opinion of Reinhardt, J., Arizona’s definition may well be violative of the rules established in Hall, and unconstitutional for that reason. Because Hall was not decided until after the state court had rendered its decision denying Smith’s Atkins claim, however, Smith had no opportunity to make this argument before the state court and the state had no opportunity to respond. In such *1191 circumstance, we might remand to allow the state court to consider the more recent Supreme Court decision, although we express no view on that question. Here, however, we need not apply Hall in light of our conclusion that Smith clearly satisfies even Arizona’s more onerous standard.

. Due to the unique procedural posture of the case, the Grell III court applied Ariz.Rev.Stat. Ann. § 13-751(C). Under this statute, a defendant may present evidence at the penalty phase of mitigating circumstances, which must be proven by a preponderance of the evidence. We note that the Grell III court applied a lower standard of proof than governs Smith’s claim, but nonetheless regard the case as a useful guidepost demonstrating the Arizona Supreme Court’s approach to the adaptive behavior prong. Grell III contains the Court’s most extended analysis on this element and identifies numerous attributes supporting a finding of significantly impaired adaptive behavior.

. Dr. Martinez attributed Smith's inability to hold jobs in part to Smith's impulsivity, itself an indicator of impaired adaptive behavior. See Grell III, 291 P.3d at 353-55; Atkins, 536 U.S. at 318, 122 S.Ct. 2242.

. The presentence report lists "Duck” and "Crazy Duck” as aliases for Smith.

. The record does not fairly support the state court’s decision to discount Hight’s testimony that Smith resembled her intellectually disabled sister on the sole ground that it was inconsistent with testimony by McCarver and a second lay witness, Sidney LeBlanc, who lived in Smith’s apartment building and drove trucks for the same company as Smith. We credit Hight’s testimony because, among these witnesses, only she had firsthand experience with someone diagnosed with intellectual disability. See Grell, 291 P.3d at 353; Arellano, 143 P.3d at 1020. More important, testimony by McCarver and LeBlanc does not contradict Hight’s assessment of Smith as intellectually disabled. McCarver's testimony, as recounted above, strongly reinforces the conclusion that Smith was intellectually disabled. LeBlanc’s explanation that he and Smith held a large number of short-term jobs as a result of their "footloose and fancy free” transient lifestyle, is at least as suggestive of traits indicating intellectual disability as of the standard of adaptive behavior expected of Smith’s age and cultural group. Accordingly, we find the state court’s determination that McCarver and LeBlanc were more credible witnesses than Hight is not fairly supported by the record.

. Dr. Thompson testified that he found this assessment in the presentence report credible notwithstanding the lack of evidence regarding the author’s level of training because, in his experience working with the Department of Corrections, the probation and parole officers who wrote such reports had experience with prisoners and diagnostic evaluations that provided them a reasonable basis from which to determine whether an individual has low or borderline cognitive functioning and because the report’s findings are corroborated by substantial evidence of Smith's impaired mental functioning. We agree that the report constitutes probative lay witness testimony of Smith’s disability. See Arellano, 143 P.3d at 1020. Accordingly, and in light of the foregoing discussion of the limitations of the Rule 11 reports, we find the state court’s determination to afford the Rule 11 reports greater weight than the presentence report, and its critique of Dr. Thompson's contrary conclusion, lacks fair support in the record.

. The dissent in Lafferty contended that the majority erred by inserting an additional, preliminary step in its review of the state court's factual determination, contrary to the requirements of 28 U.S.C. § 2254(d). 949 F.2d at 1558-59 & nn. 2-3 (Brorby, J., dissenting). The majority is clearly correct. Because a factual determination rendered under an erroneously inflated and unconstitutional legal standard does not resolve the question of whether the same answer obtains under the correct lower standard, the factual determination rendered cannot be said to constitute a valid finding. Where a state court has made no valid finding, there is nothing to which a presumption of correctness may attach.

. In its introductory statement regarding the standard of review, the state court stated that Smith had the burden of proving his Atkins claim by clear and convincing evidence. It added that, due to the unique procedural posture of the case, it would also apply the preponderance of the evidence standard applicable to Rule 32 proceedings, and that its decision would be the same *1197 under that lower standard. The body of the opinion did not fulfill that promise, however, but, rather, the court concluded after reviewing all the evidence that it did not meet a “certainty” standard.

. It is of no consequence to the analysis that Addington and Atkins involve different burdens of proof than the case at bar, because the focus here is on the effect of the standard of proof. Under Addington, a state desiring the civil commitment of an individual must demonstrate that he suffers from mental illness, whereas under Atkins an individual seeking to avoid execution by the state must demonstrate intellectual disability. In both situations, the determination heavily relies upon psychiatric opinion, and thus in both situations a standard of proof requiring "any degree of *1200 certainty” as defined by Arizona law will often render it impossible for a party to carry its burden. See Addington, 441 U.S. at 432, 99 S.Ct. 1804.

. Only Georgia applies a more onerous standard, requiring proof of intellectual disability beyond a reasonable doubt. By contrast, every other state to establish a standard of proof imposes a more relaxed standard than the state court applied here. In addition to Arizona, only four states — Colorado, Delaware, Florida, and North Carolina — apply even a clearly convincing standard, and the remaining twenty-two states imposing the death penalty and the federal government apply a preponderance standard. Hill v. Humphrey, 662 F.3d 1335, 1365 n. 1 (11th Cir.2011) (en banc) (Barkett, J., dissenting).

. Because Smith satisfies the lower "clearly convincing” standard required by Arizona’s Atkins statute, it can be assumed without deciding that the statutory standard is constitutional. However, many of the concerns expressed here apply to the clear and convincing standard as well.