Partial Concurrence and Partial Dissent by Judge BYBEE
OPINION
CHEN, District Judge:
Plaintiff-Appellant Daniel G. Demer filed suit, pursuant to the Employee Retirement Income Security Act of 1974 (“ERISA”), against Defendants-Appellees IBM Corporation LTD Plan (the “Plan”) and Metropolitan Life Insurance Company (“MetLife”). Mr. Demer claimed that MetLife, the claim administrator and insurer for the Plan, improperly denied his claim for long-term disability (“LTD”) benefits. See 29 U.S.C. § 1132(a)(1)(B) (providing that “[a] civil action may be brought ... by a participant or beneficiary ... to recover benefits due to him under the terms of his plan, to enforce his rights under the terms of the plan, or to clarify his rights to future benefits under the terms of the plan”). The district court denied Mr. Demer’s motion for summary judgment, granted Defendants’ cross-motion, and entered judgment in favor of Defendants.
We reverse the district court’s entry of judgment in Defendants’ favor and remand to the district court with instructions to remand this case to MetLife to re-evaluate the merits of Mr. Demer’s LTD claim.
I.
A. Mr. Demer’s Claim for LTD Benefits
Mr. Demer was an employee of IBM Corporation and a participant in the Plan. MetLife is the claim administrator for and insurer of the Plan. The parties agree that the Plan gives MetLife, as the administrator, discretionary authority to interpret the Plan and determine benefits eligibility. Where, as here, an ERISA plan confers discretionary authority on the plan administrator as a matter of contractual agreement, then the standard of review is abuse of discretion rather than de novo. See Tapley v. Locals 302 & 612 of the Int’l Union of Operating Eng’rs-Employers Constr. Indus. Ret. Plan, 728 F.3d 1134, 1139 (9th Cir. 2013) (“Where an ERISA Plan grants ‘discretionary authority to determine eligibility for benefits or to construe the terms of the plan,’ ‘a plan administrator’s interpretation of a plan’ is reviewed for abuse of discretion. We review the district court’s application of this standard de novo.”); Abatie v. Alta Health & Life Ins. Co., 458 F.3d 955, 963 (9th Cir. 2006) (en banc) (“[I]f the plan does confer discretionary authority as a matter of contractual agreement, then the standard of review shifts [from de novo] to abuse of discretion.”) (emphasis omitted).
The Plan provides that a participant is disabled and eligible for LTD benefits if,
during the elimination period and the first 12 months after you complete the elimination period, you cannot perform the important duties of your regular job [i.e., your own occupation] with IBM because of a sickness or injury. After expiration of that 12 month period, disabled means that, because of a sickness or injury, you cannot perform the important duties of any other gainful occupation for which you are reasonably fit by your education, training or experience.
“[G]ainful occupation” means “occupations [for which] you are reasonably qualified based on your education, training, experience, and functional ability” and further, in Mr. Demer’s case, “provides gainful wages of $4,240.48 per month or $24.46 hourly,” i.e., the equivalent of a yearly salary of approximately $50,000.
Mr. Demer stopped working at IBM on January 9, 2009, because of a disability. At the time, he was a Lead Internal Auditor at IBM. He began receiving short term disability (“STD”) benefits. In March 2009, he filed a claim for LTD benefits pursuant to the Plan (because his STD benefits were due to expire soon). In his application for LTD benefits, Mr. Demer stated: “I am unable to do my job duties due to severe recurrent depression and spinal stenosis, chronic headaches.” Symptoms included “chronic headaches, chronic back and neck pain, myalgia, severe depression, [and] sciatica.”
On July 28, 2009, MetLife approved Mr. Demer’s claim for LTD benefits under the “own occupation” test for disability articulated in the Plan. MetLife noted that the test for disability would eventually switch to the “any occupation” test on July 11, 2010. MetLife further noted that it was limiting Mr. Demer’s benefits to a period of twenty-four months because his primary diagnosis was a mental or nervous disorder.
Subsequently, in November 2009, MetLife sent a letter to Mr. Demer, reminding him that, for his benefits to continue (beyond July 11, 2010), he would have to be disabled under the “any occupation” test for disability.
Mr. Demer thereafter submitted statements and medical records from numerous treating physicians, including but not limited to his primary care doctor, Dr. Stephen Moore; a treating neurologist, Dr. David Weidman; and a treating pain management physician, Dr. Robert Osborne. These doctors discussed not only mental impairments suffered by Mr. Demer but also physical impairments. For example:
• In a statement from February 2010, Dr. Weidman referred to “chronic osteoarthritic pain and depression interact[ing] with each other.” Dr. Weidman also indicated that Mr. Demer’s physical condition had deteriorated since April 2009: e.g., in April 2009, Dr. Weidman had concluded that Mr. Demer could intermittently sit for 4-5 hours, intermittently stand for 4-5 hours, and occasionally lift 11-20 pounds; but, in February 2010, Dr. Weidman determined that Mr. Demer could only intermittently stand for 1-2 hours and never lift 11-20 pounds.
• In a medical record dated February 2010, Dr. Osborne referred to severe cervical and lumbosacral spine disease with radiculopathy and chronic depression. Notably, Dr. Osborne found that Mr. Demer had severe limitations as a result of his physical impairments— e.g., able to intermittently sit for 1 hour, intermittently stand for 0-1 hour, and intermittently walk for 0-1 hour and never able to lift up to 10 pounds.
• Dr. Moore, Mr. Demer’s primary care physician, had a similar, albeit slightly more positive, view with respect to Mr. Demer’s physical limitations, opining, e.g., that Mr. Demer could continuously sit for 1 hour and continuously stand and walk for 0-1 hour and could occasionally lift 21-50 pounds.
On October 1, 2010, MetLife denied Mr. Demer’s claim for LTD benefits under the “any occupation” test for disability. In its denial, MetLife relied in large part on the opinion of an independent physician consultant (“IPC”), Dr. Elyssa Del Valle, internal medicine, who conducted only a paper review of Mr. Demer’s file, i.e., she did not personally perform a physical or mental examination of Mr. Demer. Dr. Del Valle concluded that “[t]he medical information does support functional limitations ... due to severe degenerative disc disease, degenerative vertebral disease with numerous levels of the cervical, thoracic and lumbar spine associated with neural foraminal narrowing as well as spinal stenosis.” She also stated that “[t]he condition is associated with chronic pain necessitating narcotic analgesics despite trigger point injections, cervical and lumbar epidural injections and physical therapy.” But Dr. Del Valle disagreed with the physical capacity assessments of Dr. Moore and Dr. Osborne because they “would indicate that [Mr. Demer] is bedridden for more than 20 hours a day.” Dr. Del Valle also indicated that she agreed with an older assessment made by Dr. Weidman (from April 2009),1 noting that, although it was more than a year old, “there are no clinical data/findings to indicate any change in his overall condition” (opining, inter alia, that Mr. Demer could walk 3-4 hours intermittently and that he “should avoid any prolonged periods of sitting, standing or walking more than 30 minutes”). In its decision, MetLife determined that, even with the limitations identified by Dr. Del Valle, Mr. Demer “should be able to perform at the sedentary to light level of physical exertion as defined by the U.S. Department of Labor” and therefore denied Mr. Demer LTD benefits.
B. The Appeal
In March 2011, Mr. Demer appealed MetLife’s denial of LTD benefits. In his appeal, Mr. Demer asserted that he “has severe degenerative disc disease (‘DDD’) of the cervical and lumbar spine,” for which there was “further progression [as] reflected in the cervical MRI performed June 21, 2010.” He also claimed that he “suffers radiculopathy,” “has a history of significant headaches,” and has “ongoing nerve compression.” Finally, he pointed out that he “takes powerful narcotic and other medications” which “have known side effects causing fatigue and reduced ability to concentrate.” As noted above, MetLife’s own IPC, Dr. Del Valle, acknowledged that Mr. Demer had chronic pain that necessitated narcotic analgesics.
In support of his appeal, Mr. Demer provided, e.g., additional information from Dr. Osborne. Dr. Osborne stated, inter alia, that
the overall picture is one of a gentleman with severe spinal deterioration at all components of the spine as well as neurophysiological evidence of a delayed conduction (spine cord problem) of the bilateral Posterior Tibial Nerves to the cerebral cortex as well as a separate focal left L5 nerve root lesion (diagnostic SSEP and diagnostic L5 radiculopathy).
Dr. Osborne further stated that “[t]he overall treatment plan has included chronic narcotic medication in attempt to control his overall pain” which has side effects that “limit the ability to complete productive mental functions.” Mr. Demer also provided third-party witness statements from his brother (Frank Demer) and a friend (Shirley Piel) and a personal statement in support of his appeal. Both Ms. Piel’s statement and Mr. Demer’s personal statement addressed, inter alia, the impact Mr. Demer’s medications had on his mental ability to function.
MetLife denied the appeal, this time relying on the opinions of two different IPCs, namely, Dr. Marcus Goldman, Board Certified in psychiatry, and Dr. Dennis S. Gordan, Board Certified in physical medicine and rehabilitation. Like Dr. Del Valle, Dr. Goldman and Dr. Gordan conducted only paper reviews of Mr. Demer’s file without any personal examination.
With regard to mental functional limitations, Dr. Goldman stated that, “[g]iven the lack of recent data and the paucity of any compelling objective findings, as well as the lack of serial mental status examinations, this reviewer would be unable to establish the presence of an impairing mental condition.”
With regard to physical functional limitations, Dr. Gordan acknowledged that there was “documented anatomical cervical spinal stenosis, degenerative disc disease, and degenerative facet disease of the spine, as well as degenerative arthritis of the left hip.” He disagreed, however, that Mr. Demer suffered from a radiculopathy based on his interpretation of the medical evidence. Dr. Gordan also indicated that Dr. Osborne’s impressions may have been colored by Mr. Demer’s “dire” account of his history, “a reversal of his prior positive attitude ... about the effectiveness of the previous interventional procedures and medications.” In addition, Dr. Gordan relayed a conversation he had with Dr. Moore (Mr. Demer’s primary care physician) in which Dr. Moore said “he thought it was likely that [Mr. Demer] could do a very sedentary job, but ... felt that he would have to see him again to say that definitively.” Dr. Gordan ultimately concluded that Mr. Demer “likely had a modicum of discomfort” from, inter alia, “neck and back pain related to spinal degeneration, and referred pain down the limbs from those degenerative changes,” but Mr. Demer retained the physical functional capacity to, e.g., “sit[ ] for an hour at a time ... and up to 7 hours a day, stand[ ] and walk[ ] for 15 minutes at a time and up to 2 hours a day, lift[ ] up to 10 pounds frequently, 20 pounds occasionally.”
In addition to the above, both Dr. Goldman and Dr. Gordan addressed the specific issue raised by Mr. Demer in his appeal that the medications prescribed for his physical condition affected his ability to mentally function. According to Dr. Goldman, “there clearly are no objective or other compelling or convincing data to establish functional impairment as a result of Mr. Demer’s psychotropic medications.” Dr. Gordan stated: “There is no specific information about medications taken or effects from them during the period in question. Although Dr. Osborne asserted that the claimant’s needed narcotic medication caused cognitive side effects, there was never any evidence of that.”
In denying Mr. Demer’s appeal, MetLife appears to have accepted Dr. Gordan’s physical capacity assessment. MetLife also appears not to have placed any mental limitations on Mr. Demer as a result of his medications. Based on the physical capacity assessment and lack of any cognitive limitation, and an occupation assessment conducted by a vocational rehabilitation consultant based thereon, MetLife concluded that Mr. Demer could work in certain sedentary occupations, such as Project Director and Computer Security Coordinator.2
C. District Court Proceedings
Following MetLife’s denial of LTD benefits, Mr. Demer initiated this lawsuit. In reviewing MetLife’s denial of benefits, the district court applied the abuse-of-discretion standard and rejected Mr. Demer’s contention that the abuse-of-discretion review must be tempered with skepticism because of a conflict of interest on the part of MetLife. See Demer v. IBM Corp., 975 F.Supp.2d 1059, 1076-77 (D. Ariz. 2013). The district court found that
the record taken as a whole establishes that MetLife reasonably relied on its IPCs’ reports. Every doctor agreed that Plaintiff suffered from a combination of depression and chronic pain syndrome, but every doctor also had a different opinion as to Plaintiff’s future functionality. MetLife was required to choose between divergent opinions. MetLife’s decision to rely on its IPCs’ findings was reasonable.
Id. at 1083.
II.
We first address whether MetLife had a conflict of interest such that our review should be tempered by skepticism. See Harlick v. Blue Shield of California, 686 F.3d 699, 707 (9th Cir. 2012). A conflict of interest is a factor in the abuse-of-discretion review, the weight of which depends on the severity of the conflict. See id.; see also Renfro v. Funky Door Long Term Disability Plan, 686 F.3d 1044, 1048 (9th Cir. 2012) (noting that, “if the plan gives discretion, but the administrator operates under a conflict of interest, then ‘the conflict of interest must be weighed as a factor in determining whether there is an abuse of discretion’”) (quoting Met. Life Ins. Co. v. Glenn, 554 U.S. 105, 110-11, 128 S.Ct. 2343 (2008)); Stephan v. Unum Life Ins. Co. of America, 697 F.3d 917, 929 (9th Cir. 2012) (noting that degree of skepticism in determining whether administrator abused its discretion varies based on extent of conflict of interest); Montour v. Hartford Life & Acc. Ins. Co., 588 F.3d 623, 630-31 (9th Cir. 2009) (stating that the extent of a conflict of interest affects its weight in the overall analysis of whether an abuse of discretion occurred).3
In the instant case, the evidence of a conflict of interest on which Mr. Demer relies consists of the following: (1) MetLife is both the claim administrator for the Plan and its insurer and (2) at least two of the IPCs that MetLife hired to review the medical record (Dr. Del Valle and Dr. Gordan) have performed a significant number of reviews for MetLife and have received significant compensation for their services.
“‘We review de novo a district court’s choice and application of the standard of review to decisions by fiduciaries in ERISA cases.’” Prichard v. Metro. Life Ins. Co., 783 F.3d 1166, 1168 (9th Cir. 2015).
A. Structural Conflict of Interest
In its opinion, the district court acknowledged that MetLife has a structural conflict of interest because MetLife both evaluates claims made against the Plan and funds claims. See Montour, 588 F.3d at 630 (noting that, when “the same entity that funds an ERISA benefits plan also evaluates claims, ... the plan administrator faces a structural conflict of interest: since it is also the insurer, benefits are paid out of the administrator’s own pocket, so by denying benefits, the administrator retains money for itself”). However, the district court applied no skepticism as a result of the structural conflict because “MetLife has taken affirmative steps to reduce potential bias and promote accurate claim determinations.” Demer, 975 F.Supp.2d at 1076; see also MetLife, 554 U.S. at 117, 128 S.Ct. 2343 (noting that a conflict of interest “should prove less important (perhaps to the vanishing point) where the administrator has taken active steps to reduce potential bias and to promote accuracy, for example, by walling off claims administrators from those interested in firm finances, or by imposing management checks that penalize inaccurate decisionmaking irrespective of whom the inaccuracy benefits”).
Mr. Demer objects to the district court’s reliance on the declarations from two employees, Gregory Hafner and Laura Sullivan, who describe the affirmative steps taken by MetLife to reduce its structural conflict, on the ground that neither Mr. Hafner nor Ms. Sullivan was disclosed as a witness in MetLife’s initial disclosures as required by Federal Rule of Civil Procedure 26. MetLife did not explain its failure to identify witnesses in its mandatory initial disclosures; on the other hand, Mr. Demer did not explain his failure to take a 30(b)(6) deposition on the structural conflict issue. See James v. AT&T West Disability Benefits Program, 41 F.Supp.3d 849, 871 (N.D. Cal. 2014) (finding defendant’s failure to disclose conflict-of-interest declarations harmless because “plaintiff had ample time to seek discovery, but did not do so — [thus] she cannot credibly claim prejudice.”).
We need not resolve this issue because, even assuming there is no residual structural conflict (i.e., because of affirmative steps taken by MetLife to insulate its claims department), some skepticism is warranted here because of the financial conflict of the IPCs upon whom MetLife relied.4
B. Financial Conflict of Independent Physician Consultants
Mr. Demer claims MetLife operated under a conflict of interest because two of the IPCs that MetLife hired to review the medical record, Dr. Del Valle and Dr. Gordan, have done a substantial number of reviews for MetLife and received significant compensation from MetLife for their services. For 2009 and 2010, Dr. Del Valle performed more than 250 reviews/addendums each year and earned more than $125,000 each year; for the same time period, Dr. Gordan performed between 200-300 reviews/addendums each year and earned more than $175,000 each year. Based on the number of reviews and the amount of compensation, Mr. Demer asserts that the opinions of Dr. Del Valle and Dr. Gordan should be questioned because the doctors had financial incentives to render opinions favorable to MetLife. Mr. Demer further argues that, because MetLife relied on the doctors’ opinions in denying him relief, the doctors’ conflict is, in effect, imparted to MetLife.
As a preliminary matter, we note that Mr. Demer’s argument here is comparable to conventional approaches to discrediting the testimony of retained experts whose objectivity may be challenged based on, e.g., the number of times he or she has served as an expert in support of a party and the amount of compensation received. This alleged conflict of interest is distinct from the purported structural conflict of interest discussed above. The lack of any structural conflict of interest on the part of MetLife does not preclude MetLife from having a conflict of interest based on an IPC’s financial interests; the factors that raise the possibility of a structural conflict relate to the incentives applicable to MetLife’s claims department, whereas the factors that raise the possibility of a financial conflict relate to the incentives applicable to MetLife’s retained experts. Even if MetLife operated with no structural conflict, reliance on the reports of its retained experts who have a financial incentive to make findings favorable to MetLife may warrant skepticism.
We further take note that it is Mr. Demer’s burden, as the party claiming a conflict, to produce evidence of a financial conflict sufficient to warrant a degree of skepticism. Placing the burden on Mr. Demer, as an initial matter, makes sense given that he is asking for a departure from the otherwise applicable standard of review for abuse of discretion. Once such evidence is produced, however, the burden then shifts to MetLife to produce evidence that there is no conflict.5 Cf. Muniz v. Amec Constr. Mgmt., 623 F.3d 1290, 1295 (9th Cir. 2010) (in discussing a structural conflict of interest, stating, “when a claimant produces evidence that a plan administrator’s self-interest caused a breach of the administrator’s fiduciary obligations to the claimant, a rebuttable presumption arises in favor of the claimant and the plan bears the burden of proving that a conflict of interest did not affect its decision to deny or terminate benefits”); see also Estate of Barton v. ADT Sec. Servs. Pension Plan, 820 F.3d 1060, 1065-66 (9th Cir. 2016) (indicating that a plaintiff fairly bears the burden of proving entitlement to ERISA benefits where he or she has better or at least equal access to the evidence needed to prove entitlement; in certain cases, however, the defending entity solely controls the information that determines entitlement).
We conclude that Mr. Demer has satisfied his burden of production. Mr. Demer has offered evidence that the IPCs have earned a substantial amount of money from MetLife ($125,000-$175,000 each year) and have performed a substantial number of reviews for the company as well (200-300 reviews/addendums each year). The magnitudes of these numbers, particularly when combined, raise a fair inference that there is a financial conflict which influenced the IPCs’ assessments, and thus such conflict should be considered as a factor in reviewing MetLife’s decision for abuse of discretion. See Montour, 588 F.3d at 634 (“how frequently [the insurance company] contracts with the file reviewers it employed in this case” is relevant to ascertaining conflict); Nolan v. Heald College, 551 F.3d 1148, 1152 & n.3 (9th Cir. 2009) (evidence that the outside medical reviewers “received substantial work and monies from MetLife in the three-to-four years preceding and including [the claimant’s] benefits denial” could be a factor tempering abuse of discretion review). Here, the evidence of the IPCs’ financial conflict of interest, in the absence of contrary evidence from MetLife, warrants some skepticism in reviewing MetLife’s decision.
To be sure, the lack of more powerful evidence that, e.g., the IPCs had “ ‘some specific stake in the outcome’ ” of Mr. Demer’s case, McDonald v. Hartford Life Group Ins. Co., 361 Fed.Appx. 599, 610 (5th Cir. 2010),6 or of statistics showing a parsimonious pattern of assessments disfavorable to claimants, see Montour, 588 F.3d at 634,7 minimizes the “weight [assigned] to the conflict of interest as a factor in the overall analysis of whether an abuse of discretion occurred.” Id. at 631. But that lack of such specific evidence does not mean that there is no conflict of interest. Here, we have evidence of not only the frequency of reviews for MetLife but also the significant dollar amounts earned by the reviewers.
Furthermore, that Mr. Demer could have, but did not, develop a stronger record of the IPCs’ conflict of interest does not mean that there is no conflict. Because Mr. Demer did provide evidence of a financial conflict warranting an inference of bias, the burden shifted to MetLife to counter that evidence. As we noted in Montour, both the plaintiff and the administrator ran a risk in not developing evidence of bias or lack thereof. See id. at 634 (before addressing plaintiff’s failure to submit extrinsic evidence of bias such as statistics of rate of claims denied or frequency of file reviews, court took note of administrator’s “failure to present extrinsic evidence of any effort on its part to ‘assure accurate claims assessment’ ”). Here, MetLife could have maintained records of its reviewers’ findings on claims to show their neutrality in practice,8 but it did not. While MetLife therefore missed an opportunity to negate any inference of a financial conflict of interest, Mr. Demer failed as well to develop more powerful evidence that could have established enhanced skepticism in reviewing MetLife’s decision. Thus, we find there is neither a lack of conflict of interest (justifying no skepticism) nor a substantial conflict of interest (warranting enhanced skepticism). Instead, the financial conflict, modest but extant, warrants some, but not substantial, weight under Abatie and Montour.
The dissent argues that MetLife “listened very carefully” to our instruction in Abatie that plan administrators may reduce conflicts by “refer[ring] medical evaluations to outside experts, such as doctors, who also have no interest in firm finances,” and that for its trouble we give MetLife additional scrutiny. But the dissent fails to acknowledge that Abatie considered an administrator’s use of “truly independent medical examiners or a neutral, independent review process.” Abatie, 458 F.3d at 969 & n.7 (emphasis added). The dissent mistakenly equates outside experts with independent experts, but the former does not guarantee the latter. We do not quarrel with the notion that using outside medical evaluators can be an important step toward the goal of obtaining neutral assessments, but it is not hard to imagine an outside medical examiner who does not engage in a neutral, independent review, such as where the examiner receives hundreds of thousands of dollars from a single source and performs hundreds of reviews for that source every year.
Despite the dissent’s suggestion that the majority disapproves of outside reviewers, we imply no such disapproval; we simply apply the unremarkable proposition that the number of examinations referred and the size of the professional fees paid to a reviewer may compromise the neutrality of an expert. See Montour, 588 F.3d at 634; Nolan, 551 F.3d at 1152, n.3. The extra-circuit decisions the dissent cites do not stand for the proposition that outside experts are immune from judicial scrutiny for possible bias. While the formulation in determining whether a financial conflict of interest exists may be stated in various ways, we think it is clear under the facts in this case where MetLife paid substantial monies for a high volume of repeat work to the IPCs involved and there is no evidence rebutting an inference of bias, there is sufficient evidence of a financial conflict to temper abuse of discretion review.9
C. Evidence of Mental Limitations
Having concluded that the abuse-of-discretion review should be tempered with some skepticism, we now turn to Mr. Demer’s contention that MetLife abused its discretion in denying his claim for benefits because it did not find his mental functional capacity was affected in any way by the medications he was taking for his physical pain. As indicated above, MetLife did not ask its vocational rehabilitation consultant to consider any limitation on Mr. Demer’s mental ability to function. We conclude that MetLife abused its discretion in denying Mr. Demer’s claim.
In reaching this conclusion, we first take note of three points that are essentially undisputed:
(1) Mr. Demer “takes powerful narcotic and other medications, prescribed in attempts to manage his pain.” These medications included morphine.
(2) These medications were medically necessary to address Mr. Demer’s pain arising from physical impairments. (MetLife’s IPC Dr. Del Valle noting that Mr. Demer’s physical problems “necessitat[e] narcotic analgesics”).
(3) The “prescribed narcotic and neurological oriented medications have known side effects” on an individual’s mental functioning. (Mr. Demer’s treating physician Dr. Osborne stating that “[t]he side effects with dosing of narcotics, limit the ability to complete productive mental functions[;] [t]hey are to be expected and are limits of the only treatment available for this gentleman”).
Moreover, in a personal statement, Mr. Demer claimed that he did, in fact, suffer side effects as a result of his medications, including fatigue and difficulty with concentration (e.g., the medications “cause me to fatigue and, and they help confuse me in my thinking and ability to communicate”; “I can no longer read complex materials because I cannot concentrate to comprehend them”; and “I also have memory lapses after having read the pages I may still be looking at”).
Mr. Demer corroborated his claim with a statement from a friend, Ms. Piel (e.g., she “know[s] [Mr.] Demer”; “[t]here has been a sharp decline in his well being during the past ten years”; she has viewed his physical pain; Mr. Demer has side effects from the prescribed medication which makes him “consistently appear[ ] to be in a haze, unable to cope with what were once routine matters”; and Mr. Demer “has repeatedly demonstrated his inability to safely drive because of the inability to focus”).10
He also pointed to supporting contemporaneous evidence from his treating physicians. For example, Dr. Osborne expressed agreement that his physical examinations and medical records indicated that Mr. Demer was suffering from side effects of his medications, “which infringe on [his] ability to concentrate and tend to diminish [his] energy.” Dr. Moore commented that Mr. Demer “has cognitive limitations [secondary to] pain as well as analgesics.”
Despite this evidence, MetLife rejected any mental limitations based on the opinions of two IPCs, Drs. Goldman and Gordan, neither of whom actually examined Mr. Demer. (Dr. Goldman stated that, “[b]eyond October 29, 2010, there clearly are no objective or other compelling or convincing data to establish functional impairment as a result of Mr. Demer’s psychotropic medications.” Dr. Gordan stated that “[t]here is no specific information about medications taken or effects from them during the period in question[;] [a]lthough Dr. Osborne asserted that the claimant’s needed narcotic medication caused cognitive side effects, there was never any evidence of that.”) Implicit in each doctor’s opinion (and therefore MetLife’s decision) was a conclusion that Mr. Demer’s complaints of fatigue and difficulty concentrating were not credible.
But the IPCs had little basis for rejecting Mr. Demer’s credibility. In addition to the fact that the IPCs never examined Mr. Demer, they never explained specifically why they rejected Mr. Demer’s claim of mental function limitations when (1) he was taking what are undisputedly powerful narcotic medications and (2) his subjective complaints were corroborated by his treating physicians as well as a friend (Ms. Piel).11 See Godmar v. Hewlett-Packard Co., 631 Fed.Appx. 397, 406 (6th Cir. 2015) (stating that “there is ‘nothing inherently objectionable about a [paper] review,’ ” but such “reviews are particularly troubling when the administrator’s consulting physicians — who have never met the claimant — discount the claimant’s limitations as subjective or exaggerated”; adding that “ ‘we will not credit a file review to the extent that it relies on adverse credibility findings when the files do not state that there is reason to doubt the applicant’s credibility’ ”); Montour, 588 F.3d at 634-35 (indicating that a plan should not require a claimant to provide objective proof of his pain level and that a plan should not reject subjective claims of excess pain based solely on a paper review’s observation that a physical impairment should not cause the claimant as much pain as he was reportedly suffering); cf. Rollins v. Massanari, 261 F.3d 853, 857 (9th Cir. 2001) (in the Social Security context, noting that “subjective pain testimony cannot be rejected on the sole ground that it is not fully corroborated by objective medical evidence”).
We acknowledge that the district court’s order suggests possible grounds for questioning Mr. Demer’s credibility, i.e., that his activities of daily living indicated some ability to engage in mental functioning. See Demer, 975 F.Supp.2d at 1081 (stating that “Dr. Osborne’s opinion that Plaintiff could not operate a vehicle was directly contradicted by Plaintiff’s conversations with MetLife on January 14, 2010 and May 18, 2010, where he stated that he had been driving a vehicle[;] [f]urther, while receiving disability payments, Demer told a MetLife claims representative that ‘he was just completing online courses’ ”); (MetLife’s electronic diary notes). But neither MetLife nor its IPCs rejected Mr. Demer’s credibility on this basis. See Harlick, 686 F.3d at 719-20 (stating that “[t]he general rule ... is that a court will not allow an ERISA plan administrator to assert a reason for denial of benefits that it had not given during the administrative process”). Moreover, it is not clear that these activities of daily living necessarily establish an ability to work within the meaning of the Plan. Notably, under the terms of the Plan, Mr. Demer is eligible for LTD benefits if he cannot engage in a “gainful occupation,” which in Mr. Demer’s case is a job that has a yearly salary of approximately $50,000. A job that commands such a salary may well require higher levels of mental functioning, including concentration and memory, both of which are areas where Mr. Demer has claimed impairment as a result of his medications.
D. Evidence of Physical Limitations
There is an additional factor weighing in favor of finding an abuse of discretion by MetLife. In denying Mr. Demer’s appeal, MetLife effectively adopted the physical functional capacity assessed by Dr. Gordan, i.e.,
that Mr. Demer would be capable of sitting for an hour at a time, with short breaks for stretching, up to seven hours a day; standing and walking for 15 minutes at a time and up to two hours a day; lifting up to 10 pounds frequently, 20 pounds occasionally and 35 pounds rarely; occasionally twisting, bending, stooping, and reaching above shoulder level, driving, and doing repetitive movements with either hand and occasionally climbing stairs.
As above, Dr. Gordan implicitly rejected Mr. Demer’s credibility based solely on a paper review, without having physically examined him and without explaining why Mr. Demer’s credibility was lacking, particularly in light of some medical records conflicting with Dr. Gordan’s physical functional capacity assessment. Most notably, Dr. Gordan’s assessment conflicted with the more restrictive assessment rendered by MetLife’s other IPC, Dr. Del Valle, which MetLife had previously adopted in initially denying Mr. Demer benefits. For instance, with respect to lifting capacity, Dr. Gordan found that *907Mr. Demer could lift up to 10 pounds frequently, but previously, MetLife found (as part of its initial denial) that Mr. Demer could not frequently lift more than 10 pounds. Also, whereas Dr. Gordan found that Mr. Demer could sit with breaks up to seven hours a day, MetLife previously found (based on Dr. Del Valle’s initial assessment) that he could only sit “4-6 hours per 8 hour work day with proper ergonomics and the ability to change position as needed.” MetLife never explained why it concluded that Dr. Gordan’s assessment was more appropriate than Dr. Del Valle’s earlier assessment, particularly since the record indicated that Mr. Demer’s condition did not improve (and may have deteriorated) over time.12
E. Conclusion
Taking into account the totality of the circumstances, i.e., the financial conflict of interest of the IPCs on whom MetLife relied (which warrants some skepticism in reviewing the IPCs’ conclusions); the substantial evidence of Mr. Demer’s mental limitations due to pain medication and physical limitations; and the IPCs’ reviews of Mr. Demer’s condition without having examined him and without explaining why they rejected his credibility, particularly in light of evidence corroborating his credibility (both medical and nonmedical), MetLife abused its discretion in denying Mr. Demer’s claim for LTD benefits.
F. Remedy
The question remaining is what remedy should issue. See Cook v. Liberty Life Assurance Co. of Boston, 320 F.3d 11, 24 (1st Cir. 2003) (“Once a court finds that an administrator has acted arbitrarily and capriciously in denying a claim for benefits, the court can either remand the case to the administrator for a renewed evaluation of the claimant’s case, or it can award a retroactive reinstatement of benefits.”). We hold that a remand to the district court, with instructions to remand to MetLife, is appropriate. An award of benefits is not a proper remedy because the record does not clearly establish that MetLife should necessarily have awarded Mr. Demer benefits. Cf. Grosz-Salomon v. Paul Revere Life Ins. Co., 237 F.3d 1154, 1163 (9th Cir. 2001) (“[R]etroactive reinstatement of benefits is appropriate in ERISA cases where ... ‘but for [the insurer’s] arbitrary and capricious conduct, [the insured] would have continued to receive the benefits’ or where ‘there [was] no evidence in the record to support a termination or denial of benefits.’ ”).
To be clear, on remand, MetLife may reopen the record to consider additional evidence regarding mental limitations. The record as it stands does not show precisely what Mr. Demer’s limitations were as a result of the medications. While a retrospective evaluation may be difficult given the passage of time, a retrospective evaluation of Mr. Demer’s limitations is not necessarily impossible. Indeed, in the Social Security context, retrospective evaluations are not uncommon. Historical records, data and trends may be relevant and useful in rendering a retrospective evaluation. See, e.g., Smith v. Bowen, 849 F.2d 1222, 1225 (9th Cir. 1988) (in Social Security case, stating that “reports containing observations made after the period for *908disability are relevant to assess the claimant’s disability[;] [i]t is obvious that medical reports are inevitably rendered retrospectively and should not be disregarded solely on that basis”). Furthermore, a current evaluation of Mr. Demer may be particularly useful because his benefit period may have extended beyond the date of the appeal, see 2ER 130, 217 (addressing Maximum Benefit Period), such that a current examination may be closer in time to the assessment period than it would otherwise appear.
III.
Accordingly, we REVERSE and REMAND with instructions to the district court to remand this case to MetLife so that it may re-evaluate the merits of Mr. Demer’s LTD claim.
. As indicated above, Dr. Weidman submitted a more recent statement from February 2010, indicating that Mr. Demer's condition had worsened after April 2009.
. After MetLife denied his appeal, Mr. Demer sent a letter to MetLife, claiming that there was additional information from Dr. Weidman and Dr. Moore that had been sent to MetLife prior to the decision on appeal but that had not been addressed in the decision on appeal. Mr. Demer attached that information to his letter. That information included, inter alia, a treatment note from Dr. Weidman indicating that Mr. Demer was on a higher amount of opiate analgesics which seemed to cause a slight change in his speech and a treatment note from Dr. Moore stating that Mr. Demer was taking opiates which affected his cognition and executive functioning, including memory. The documentation was reviewed by a MetLife appeals nurse consultant. “The nurse consultant opined that while some current clinical exam changes were noted, no additional clinical findings were submitted relating to the appeal period in question....” MetLife also noted that the Plan had only “one level of appeal” and that appeal had already been denied on May 6, 2011.
At the trial level, the district court refused to consider Mr. Demer's post-appeal evidence. Mr. Demer now argues that this refusal was erroneous. For purposes of this appeal, we need not decide the issue of whether the post-appeal evidence should have been considered. Even without the post-appeal evidence, Mr. Demer is entitled to a remand, as discussed below.
. The dissent is critical of our review for abuse of discretion with skepticism because, inter alia, the term skepticism is “not descriptive in some useful way,” and even less so when "modified by a raft of adjectives.” However, the framework employing abuse of discretion review subject to some degree of skepticism (where warranted) is well established under both Glenn and Ninth Circuit law. See Glenn, 554 U.S. at 117, 128 S.Ct. 2343 (noting that requiring consideration of a conflict of interest as a factor "is no stranger to the judicial system” as "[n]ot only trust law, but also administrative law, can ask judges to determine lawfulness by taking account of several different, often case-specific, factors, reaching a result by weighing all together”); Abatie, 458 F.3d at 969 (noting that "abuse of discretion review, with any 'conflict ... weighed as a factor,' is indefinite” but "trial courts are familiar with the process of weighing a conflict of interest”).
. The district court did not err in considering the Social Security Administration’s ("SSA") denial of Mr. Demer’s claim for disability benefits as additional evidence that MetLife did not have a conflict of interest. See Demer, 975 F.Supp.2d at 1077 (stating that, “[a]lthough not a decision by an administrative law judge, the SSA's findings support the objectivity of MetLife's review of the medical evidence”). Contrary to what Mr. Demer suggests, the district court did not rely on the SSA decision to support MetLife’s ruling on the merits. See Harlick, 686 F.3d at 719-20 (stating that "[t]he general rule ... is that a court will not allow an ERISA plan administrator to assert a reason for denial of benefits that it had not given during the administrative process").
. In so ruling, we acknowledge the Supreme Court’s statement in Glenn that it did not "believe it necessary or desirable for courts to create special burden-of-proof rules, or other special procedural or evidentiary rules, focused narrowly upon the evaluator/payor conflict.” Glenn, 554 U.S. at 116, 128 S.Ct. 2343. However, we do not read this language as barring the burden approach articulated above. Glenn's statement was directed at the issue, once a conflict is identified, " 'how' the conflict ... should ‘be taken into account on judicial review of a discretionary benefit determination.’ " Id. at 115, 128 S.Ct. 2343 (citation omitted). The Supreme Court did not consider and hence did not foreclose an articulation of the burdens in determining whether there is a cognizable conflict in the first place.
. See also Davis v. Unum Life Ins. Co. of Am., 444 F.3d 569, 575-76 (7th Cir. 2006) (rejecting contention that “in-house doctors have an inherent conflict in every case”; noting lack of evidence of "any specific incentive [for the in-house doctors] to derail [a] claim” — e.g., giving the doctors "some specific stake in the outcome of [a] case, such as paying the doctors more if [the] claim were denied”). While such a stake would be strong evidence of conflict, as a practical matter, this seems a highly unlikely scenario. It is hard to imagine that a plan administrator would explicitly tie compensation to results, as clearly such a practice would be viewed disapprovingly by courts. To the extent the dissent suggests that no financial conflict of interest may be found absent evidence that an IPC has a specific stake in the outcome of the case, such a rule, if adopted, would render review for financial conflict of interest toothless. Nothing we said in Montour suggests such a categorical rule.
. See, e.g., Caplan v. CNA Fin. Corp., 544 F.Supp.2d 984, 992 (N.D. Cal. 2008) (noting that a doctor "stood to benefit financially from the repeat business that might come from providing [defendant] with reports that were to its liking”; adding that "[t]he history of [the doctor’s] conclusions provides evidence of this conflict”).
. For example, MetLife could have provided but did not proffer evidence that the IPCs being challenged do not in fact have a parsimonious pattern of assessment unfavorable to claimants. Nor did MetLife submit evidence, e.g., that the fees paid to the IPCs constitute only a small fraction of their income or that the high number of reviews conducted by an IPC in a particular year was an aberration.
. Contrary to what the dissent seems to suggest, bias of an IPC may be inferred even where the IPC is not entirely "financially dependent” on income received from an administrator. Obtaining even, e.g., 30% of one's income from one administrator could be sufficiently influential as to give rise to a reasonable inference of bias. See, e.g., Nolan, 551 F.3d at 1152 n.3 (30% of reviewer’s income came from administrator). Moreover, were claimants required to show financial dependence in order to establish lack of neutrality, the personal financial circumstances and needs of each IPC could be subjected to routine inquiry in ERISA cases, a result hardly conducive to the recruitment of competent reviewers or to the efficient and expeditious review of benefit decisions contemplated by ERISA.
. Similarly, a medical record indicated that Mr. Demer's father told him “he was ‘druggy’ ” after being prescribed certain medication (medical record from treating physician, Dr. Debra Weidman (anesthesiologist)).
. Because Ms. Piel’s statement was submitted only on the appeal to MetLife, Dr. Del Valle was not able to consider it. However, Dr. Gordan should have taken into account Ms. Piel’s statement as he was the IPC on appeal.
. We acknowledge that Dr. Gordan did consult with Mr. Demer's treating physician, Dr. Moore, and Dr. Moore indicated to Dr. Gordan that "it was likely that Mr. Demer could do a very sedentary job.” Dr. Moore, however, added that "he felt that he would have to see [Mr. Demer] again to say that definitively.” Yet Dr. Gordan never physically or mentally examined Mr. Demer for confirmation one way or the other; nor did Dr. Moore. Hence, any reliance by MetLife on Dr. Moore's statement is misplaced.