Zen Magnets, LLC v. Consumer Product Safety Commission

EBEL, Circuit Judge.

Petitioner Zen Magnets, LLC (“Zen”) challenges a regulation promulgated by Respondent Consumer Product Safety Commission (“the Commission”) restricting the size and strength of the rare earth magnets that Zen sells. See Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. 59,962 (Oct. 3, 2014) (codified at 16 C.F.R. §§ 1240.1-1240.5). We conclude that the Commission’s prerequisite factual findings, which are compulsory under the Consumer Product Safety Act, 15 U.S.C. §§ 2051-2089, are incomplete and inadequately explained. Accordingly, we VACATE and REMAND to the Commission.

I. BACKGROUND

This case concerns sets of small, high-powered magnets (“magnet sets”) that users can arrange and rearrange in various geometric designs. The component magnets are unusually small (their diameters are approximately five millimeters) and unusually powerful (due to rare earth metal cores, their magnetic flux index1 ranges from 400 to 500 kG2 mm2). A set typically comprises on the order of 100 to 200 identical spherical magnets, coated in reflective silver or other bright colors. Magnets of this type were introduced to the United States circa 2009. They have since been marketed and sold to consumers—by Zen and other distributors—as desktop trinkets, stress-relief puzzles, and toys, and apparently also for educational and scientific purposes.

Although the strength of these magnets is part of their appeal, it can also pose a grave danger when the magnets are misused. Specifically, if two or more magnets are ingested—a temptation to which children are especially at risk—they can cause serious damage to intestinal tissue that becomes tightly clamped between them. Attendant medical consequences can include hospitalization and surgery for such injuries as perforations, infections, gastrointestinal bleeding, and tissue death. The danger is compounded when parents and medical personnel remain unaware of the type of magnets ingested and their heightened risks.

That danger caught the attention of the Consumer Product Safety Commission. The Commission is an independent regulatory agency that administers and enforces the Consumer Product Safety Act (“the Act”), 15 U.S.C. §§ 2051-2089, a primary purpose of which is “to protect the public against unreasonable risks of injury associated with consumer products,” id. § 2051(b)(1). In pursuit of that goal, the Commission is authorized to “promulgate consumer product safety standards” establishing performance or warning requirements for consumer products, id. § 2056(a), as well as to ban hazardous products altogether, id. § 2057.

The Commission’s regulatory approach towards magnet sets progressed as follows. In 2008, Congress adopted as mandatory safety standards certain requirements developed by the American Society for Testing and Materials (“ASTM”) to address hazards associated with children’s toys. See generally 15 U.S.C. § 2056b; AR 142. With respect to magnets, those requirements prohibit any product “designed, manufactured, or marketed as a plaything for children under 14 years of age” from containing a loose magnet that (1) has a flux index greater than 50 kG2 mm2 and (2) is small enough to fit within a standardized “small parts cylinder.”2 ASTM International Standard F963-11 Consumer Safety Specifications for Toy Safety §§ 3.1.37 (definition of “hazardous magnet”), 3.1.81 (definition of “toy”), 4.38-4.38.1 (prohibition of hazardous magnets in toys), and Fig. 3 (defining the small parts cylinder’s dimensions to be a diameter of 31.7 mm with a height that, due to a sloped bottom surface, ranges from 25.4 mm on one side to 57.1 mm on the opposite side).3 The purpose of those restrictions is to ensure that permissible magnets are either large enough to discourage ingestion or weak enough to avoid tissue strangulation upon ingestion. See ASTM F963-11 § 4.38; cf. Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. at 59,968.

During 2011, in response to reports of injured children, Commission staff began evaluating whether the magnet sets currently on the market complied with ASTM F963 (“the toy standard”). The Commission found that the individual magnets in those sets tended to be ten times more powerful—or, alternatively, six times smaller—than is permissible to market to children under the toy standard. See ASTM F963-11 §§ 3.1.37, 4.38-4.38.1; Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. at 59,976-77. Accordingly, Commission staff issued Notices of Noncompliance to companies that labeled or marketed these powerful magnet sets to appeal to children younger than fourteen years old, and warned other firms not to market their sets to children below that age.4

Some distributors took steps to comply with the toy standard, including implementing labeling enhancements and marketing restrictions. However, “into spring 2012, staff continued to identify additional firms offering [magnet sets] on the Internet with labeling and marketing violations.” Proposed Rule: Safety Standard for Magnet Sets, 77 Fed. Reg. 53,781, 53,782 (proposed Sept. 4, 2012) (to be codified at 16 C.F.R. §§ 1240.1-1240.5). Moreover, reports of child injuries from magnet ingestion continued.

So the Commission stepped up its enforcement efforts. In May 2012, the Commission required the thirteen leading magnet set distributors to report any information of which they were aware reasonably supporting the conclusion that their magnets did not comply with an applicable safety standard, contained a defect, or created an unreasonable risk of serious injury. See 15 U.S.C. § 2064(b) (requiring distributors to report potential noncompliance with safety standards, defects, and risk of serious injury). Based on that information, by July 2012 Commission staff had negotiated agreements with ten of those companies to cease importation and distribution of magnet sets. Commission staff then initiated administrative complaints against the remaining three companies (including Zen), arguing that their magnet sets constituted “substantial product hazards” that must be prohibited and recalled because they failed to comply with the toy standard and/or contained a product defect.5 See 15 U.S.C. § 2064(a) (defining “substantial product hazard” to be a product that either (1) fails to comply with an applicable safety standard or (2) contains a product defect), (c) (authorizing the Commission to order a seller to cease distributing and to recall products that constitute a “substantial product hazard”).

Four months after eliminating ten of the leading magnet set distributors, the Commission proposed a new safety standard aimed at regulating the size and strength of all magnet sets. See Proposed Rule: Safety Standard for Magnet Sets, 77 Fed. Reg. 53,781. In effect, the proposed standard extended the size and strength restrictions applicable to children’s toys under ASTM F963 to magnets marketed, intended, or used for adult entertainment. After receiving comments and holding a public hearing, the Commission promulgated the proposed rule as a final safety standard on October 3, 2014. See Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. at 59,962, 59,966-72.

The final rule requires that, “Each magnet in a magnet set ... that fits completely within the cylinder described in 16 CFR 1501.4 must have a flux index of 50 kG2 mm2 or less when tested in accordance with the method described in § 1240.4.” 16 C.F.R. § 1240.3. The referenced cylinder is the same small parts cylinder as that used in the toy standard. Compare 16 C.F.R. § 1501.4 with ASTM F963-11 § 3.1.37 and Fig. 3. And the flux index limit of 50 kG2 mm2 is the same limit as that used in the toy standard. See 16 C.F.R. § 1240.4 (incorporating the flux index measurement procedure of ASTM F963-11 §§ 8.24.1-8.24.3). As a result, the primary difference between the two standards is their scope of intended consumers. Unlike the toy standard, the final rule is not limited to magnets designed or marketed as toys for children under fourteen years of age, but rather applies to all magnet sets that meet the following definition: “Any aggregation of separable magnetic objects that is a consumer product intended, marketed or commonly used as a manipulative or construction item for entertainment.”6 16 C.F.R. § 1240.2(b).

Zen is the only remaining importer and distributor of the magnet sets targeted by the final rule. Over the years, Zen has made efforts to comply with the toy standard by implementing fourteen-and-under age restrictions and placing warnings on its website and packaging, as well as by imposing sales restrictions on its retail distributors. Its magnet sets, however, do not comply with the strength and size restrictions of the final rule set forth at 16 C.F.R. § 1240.3. Accordingly, Zen seeks review of that safety standard pursuant to 15 U.S.C. § 2060(a), which provides that any person adversely affected by a rule promulgated by the Commission “may file a petition with the United States court of appeals ... for the circuit in which such person ... resides or has his principal place of business for judicial review of such rule.”

II. DISCUSSION

Exercising jurisdiction pursuant to 15 U.S.C. § 2060(c), we review the magnet set safety standard in accordance with the provisions for judicial review set forth in the Administrative Procedure Act (“APA”), 5 U.S.C. ch. 7. See 15 U.S.C. § 2060(c) (“[T]he court shall have jurisdiction to review the consumer product safety rule in accordance with chapter 7 of title 5, and to grant appropriate relief ... as provided in such chapter ....”). Accordingly, “our review is ‘very deferential to the agency.’ ” Andalex Res., Inc. v. Mine Safety & Health Admin., 792 F.3d 1252, 1257 (10th Cir. 2015) (quoting Ron Peterson Firearms, LLC v. Jones, 760 F.3d 1147, 1161 (10th Cir. 2014)). Notwithstanding that deferential standard, we conclude that the Commission failed to meet the Consumer Product Safety Act’s requirements for issuing a safety standard, for the reasons explained below.

I. The Consumer Product Safety Act

Broadly speaking, the Act sets forth a two-step process for promulgating a safety standard. See D. D. Bean & Sons Co. v. Consumer Prod. Safety Comm’n, 574 F.2d 643, 649 (1st Cir. 1978). First, the Commission must “consider” and “make appropriate findings” regarding the social and economic costs and benefits of the rule. See 15 U.S.C. § 2058(f)(1), (2). Specifically, the Commission must make findings identifying (1) the degree and nature of the risk of injury sought to be prevented; (2) the approximate number and type of products subject to the rule; (3) the public’s need for those products, and the probable effect of the rule on the utility, cost, and availability of the products; and (4) any means of reducing the risk of injury while minimizing adverse effects on competition or other commercial practices. Id.

Second, the Commission must balance the costs and benefits identified in its findings to determine whether a safety standard is justified. See 15 U.S.C. § 2058(f)(3). Specifically, the Commission can only promulgate a safety standard if it reaches and articulates four conclusions: (1) “that the rule ... is reasonably necessary to eliminate or reduce an unreasonable risk of injury”; (2) “that the ... rule is in the public interest”; (3) “that the benefits expected from the rule bear a reasonable relationship to its costs”; and (4) “that the rule imposes the least burdensome requirement which prevents or adequately reduces the risk of injury for which the rule is being promulgated.”7 See 15 U.S.C. § 2058(f)(3)(A), (B), (E), (F).

Overall, then, the determination “involves ‘a balancing test like that familiar in tort law: The regulation may issue if the severity of the injury that may result from the product, factored by the likelihood of the injury, offsets the harm the regulation imposes upon manufacturers and consumers.’ ” Southland Mower Co. v. Consumer Prod. Safety Comm’n, 619 F.2d 499, 508-09 (5th Cir. 1980) (quoting Aqua Slide ‘N’ Dive Corp. v. Consumer Prod. Safety Comm’n, 569 F.2d 831, 839 (5th Cir. 1978)); see also 15 U.S.C. § 2056(a) (requiring that “[a]ny requirement of such a [safety] standard shall be reasonably necessary to prevent or reduce an unreasonable risk of injury associated with such product”).

The Act provides that a court may not uphold a safety standard unless the Commission’s statutorily required findings and conclusions are “supported by substantial evidence on the record taken as a whole.” 15 U.S.C. § 2060(c). Substantial evidence is “such relevant evidence as a reasonable mind might accept as adequate to support a conclusion.”8 Fowler v. Bowen, 876 F.2d 1451, 1453 (10th Cir. 1989) (quoting Richardson v. Perales, 402 U.S. 389, 401, 91 S.Ct. 1420, 28 L.Ed.2d 842 (1971) (“The [Supreme] Court has adhered to that definition in varying statutory situations.”)); see also Am. Textile Mfrs. Inst., Inc. v. Donovan, 452 U.S. 490, 522-23, 101 S.Ct. 2478, 69 L.Ed.2d 185 (1981) (adhering to that definition when reviewing whether safety standards issued by the Occupational Safety and Health Administration were “reasonably necessary” under 29 U.S.C. § 652(8)). A court may “neither reweigh the evidence nor substitute [its] judgment for that of the agency.” Andalex Res., 792 F.3d at 1257 (quoting Branum v. Barnhart, 385 F.3d 1268, 1270 (10th Cir. 2004)). Nonetheless, “[t]he substantiality of evidence must take into account whatever in the record fairly detracts from its weight.” Norris v. NLRB, 417 F.3d 1161, 1168 (10th Cir. 2005) (internal quotation marks omitted).

II. The Commission’s findings

In this instance, the Commission’s rule-making analysis fails at the first step of the Act’s two-step process: the initial cost and benefit findings. Specifically, the Commission’s analysis neglected to address critical ambiguities and complexities in the data underpinning the Commission’s findings as to (1) the degree of the risk of injury caused by magnet sets, and (2) the public’s need for the sets and the rule’s effect on their utility and availability, see 15 U.S.C. § 2058(f)(1)(A), (C). As a result of those omissions, the Court is unable to ascertain whether the Commission’s findings meet the substantial evidence standard—let alone to proceed to the next step of reviewing the Commission’s balancing of the safety standard’s costs and benefits.

1. Risk of injury

After analyzing a nationwide sampling of emergency room injury reports, the Commission estimated that the final rule would prevent approximately 900 magnet set-ingestion injuries annually, for a savings of $28.6 million. See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,978-80; 16 C.F.R. § 1240.5(e)(2), (3). The Commission’s benefit analysis, however, gives short shrift to two aspects of the injury data set that cast doubt on the Commission’s findings.

The first problem stems from the data set’s time frame. In performing its cost-benefit analysis, the Commission chose to rely on data spanning January 2009 through June 2012. See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,978-80; 16 C.F.R. § 1240.5(e)(2), (3). But that data set does not reflect the subsequent significant market changes triggered by the Commission’s compliance activities beginning in May 2012. As of July 2012, ten of the thirteen largest distributors had agreed, “at [Commission] staff’s request,” to stop selling and start recalling magnet sets; by December 2012, the dominant firm in the market had ceased operating. Id. at 59,964, 59,978. Sales of magnet sets dropped commensurately (in the Commission’s words, they dropped “dramatically”). See id. at 59,978 (“[A]s a result of these actions and events, sales of the subject magnet sets currently are dramatically lower than they were at the time of the enforcement actions.”); 16 C.F.R. § 1240.5(b) (estimating that magnet set sales, which totaled 2.7 million from 2009 to 2012, dropped to fewer than 25,000 per year after 2012).

As might be expected, injuries associated with ingestion of magnets from magnet sets also declined. According to the Commission’s calculations regarding the eighteen months following June 2012, the estimated number of emergency room visits due to magnet sets dropped by about 100 incidents a year.9 See 16 C.F.R. § 1240.5(a). Inasmuch as the Commission estimated the expected useful life of magnet sets to be about one year, the injury rates appeared poised to continue to drop. See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,982. Indeed, the number of incidents reported directly to the Commission receded from 52 in 2012, to 13 in 2013, to only 2 in 2014. See id. at 59,962.

The Commission recognized that the decrease in injuries was “[l]ikely due to [Commission] enforcement and regulatory activity beginning in mid-2012.” Id. It appears that the Commission’s regulatory activity was predicated at least in part on enforcing the preexisting toy standard, see id. at 59,962, 59,978 n.14 (incorporating administrative complaint by reference), which prohibits designing and marketing magnet sets to children, see ASTM F963-11 §§ 3.1.37, 4.38-4.38.1. Most of the pre-enforcement reported and estimated injuries concerned young children. See id. at 59,964 (stating that eighty-seven of the 100 incidents reported directly to the Commission concerned children younger than twelve years old, and 65% of the estimated injuries involved children between four and twelve years old). The Commission’s benefits findings, however, do not adequately account for the reduced injury rate (and therefore reduced need for a new standard) resulting from its recent apparent enforcement of the existing safety standard addressed specifically to toys and children.

In general, where there is a known and significant change or trend in the data underlying an agency decision, the agency must either take that change or trend into account, or explain why it relied solely on data pre-dating that change or trend. See, e.g., Cty. of Los Angeles v. Shalala, 192 F.3d 1005, 1020-22 (D.C. Cir. 1999) (remanding a Medicare rate-setting for the agency to explain why it relied on data collected under its former payment regime, where more recent data collected under its current regime showed a marked downward trend in relevant hospital discharge times); Seattle Audubon Soc. v. Espy, 998 F.2d 699, 703-04 (9th Cir. 1993) (finding that an agency preparing an environmental impact statement erred in failing to address an intervening, independent report indicating that an endangered species’ “population [wa]s declining more substantially and more quickly than previously thought”).10

Since agencies “have an obligation to deal with newly acquired evidence in some reasonable fashion,” Catawba Cnty. v. EPA, 571 F.3d 20, 45 (D.C. Cir. 2009), or to “reexamine” their approaches “if a significant factual predicate” changes, Bechtel v. FCC, 957 F.2d 873, 881 (D.C. Cir. 1992), an agency must have a similar obligation to acknowledge and account for a changed regulatory posture the agency creates—especially when the change impacts a contemporaneous and closely related rulemaking.

Portland Cement Ass’n v. EPA, 665 F.3d 177, 187 (D.C. Cir. 2011) (holding that, before issuing a new rule based on the predicted emissions of certain pollutant sources, an agency should have considered the effect that a parallel pending rulemaking would have on those same emissions). “The refrain that [an agency] must promulgate rules based on the information it currently possesses simply cannot excuse its reliance on that information when its own process [may have] render[ed] it irrelevant.” Id.

Here, the downward trend in injury rates is obvious, and appears to speak directly to the question of whether the new rule is “reasonably necessary.” 15 U.S.C. § 2056(a). Yet the Commission offered no explanation or rationale for its apparent assumption that the observed reduction in injury rates would not endure. Rather, in addressing its decision to rely solely on pre-enforcement injury data, the Commission stated only:

Because [Commission] compliance actions have significantly altered the state of the market, the environment before these actions occurred represents the best approximation of how the market would have operated in the absence of [Commission] intervention and is the appropriate reference baseline for evaluating the impact of the rule.

Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. at 59,978. That conclusory statement is insufficient to fulfill the Commission’s duty to explain why the downward trend in post-enforcement injury rates was not relevant to its evaluation of the benefits of the new rule. See Portland Cement, 665 F.3d at 187; Shalala, 192 F.3d at 1020-21.

This Court stands ready and willing to defer to agency expertise and discretion, properly exercised. See Andalex, 792 F.3d at 1257. But “[i]t is not the role of the courts to speculate on reasons that might have supported an agency’s decision.” Encino Motorcars, LLC v. Navarro, — U.S. -, 136 S.Ct. 2117, 2127, 195 L.Ed.2d 382 (2016); see also Motor Vehicle Mfrs. Ass’n of U.S., Inc. v. State Farm Mut. Auto. Ins. Co., 463 U.S. 29, 43, 103 S.Ct. 2856, 77 L.Ed.2d 443 (1983) (“We may not supply a reasoned basis for the agency’s action that the agency itself has not given.”) (quoting SEC v. Chenery Corp., 332 U.S. 194, 196, 67 S.Ct. 1760, 91 L.Ed. 1996 (1947)). “Whatever potential reasons the [Commission] might have given, the agency in fact gave ... no reasons at all.” Id.

An agency may not simply ignore without analysis important data trends reflected in the record. See Portland Cement, 665 F.3d at 187; Shalala, 192 F.3d at 1020-21. To the extent the Commission’s findings rely solely on pre-enforcement injury rates, the Commission must offer a credible record-supported explanation as to why those rates accurately reflect the benefits of the new rule.11

The second problem with the Commission’s injury findings arises from the imprecision of the injury report narratives. The Commission used a keyword search to identify magnet set-related injuries within a representative sample of emergency room reports. See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,978. To the resulting injury count, the Commission applied a cost model to extrapolate the overall number of injuries nationwide. See id. at 59,979. We take issue not with the Commission’s methodology, but rather with the degree of uncertainty the Commission condoned when implementing it: According to the Commission, ninety percent of the injury reports on which it ultimately relied only “possibly” involved the subject magnet sets.12 See id. at 59,978, 59,980 (“[A]bout 90 percent of the cases upon which the table [estimating medical costs] was based were described as only possibly involving the magnets of interest ....”), 59,985 (“[T]here was an annual average of about 929 medically attended magnet ingestions that were defined as at least ‘possibly of interest’ during the period from 2009 through June 2012.”).13

The Act provides that the Commission cannot promulgate a safety standard unless it concludes “that the rule ... is reasonably necessary to eliminate or reduce an unreasonable risk of injury.” 15 U.S.C. § 2058(f)(3)(A). Underlying findings that peg the risk of injury as a mere “possibility” provide the Court no assistance in assessing that conclusion. See Gulf S. Insulation v. Consumer Prod. Safety Comm’n, 701 F.2d 1137, 1148 (5th Cir. 1983) (finding the Commission failed to show an unreasonable risk of injury because the equivocal predictions that the increased cancer risk could be “up to 51 in a million,” and that “somewhat less than 20% of the population may respond” to the given toxicity level, “provides [the court] no basis for review”); Southland Mower, 619 F.2d at 510 (“Without reliable evidence of the likely number of injuries that would be addressed ..., we are unable to agree that this provision is reasonably necessary to reduce or prevent an unreasonable risk of injury.”). Almost anything is “possible.” Therefore, the Commission’s finding that 90% of the predicate injuries only “possibly” involved magnet sets provides the Court with little guidance as to where, on the spectrum from ninety to 900 annual injuries, the real injury rate lies.

We need not decide here what would be an acceptable degree of uncertainty in a benefits finding; it may vary depending on the inherent factual uncertainties in a given context. However, we are confident that mere possibility falls short of the appropriate standard. See Morall v. Drug Enf't Admin., 412 F.3d 165, 176 (D.C. Cir. 2005) (“Substantial evidence ‘means evidence which is substantial, that is, affording a substantial basis of fact from which the fact in issue can be reasonably inferred. Substantial evidence is more than a scintilla, and must do more than create a suspicion of the existence of the fact to be established.’ ”) (quoting NLRB v. Columbian Enameling & Stamping Co., 306 U.S. 292, 299-300, 59 S.Ct. 501, 83 L.Ed. 660 (1939)); Greater Yellowstone Coal., Inc. v. Servheen, 665 F.3d 1015, 1028 (9th Cir. 2011) (“It is not enough for the [agency] to simply invoke ‘scientific uncertainty’ to justify its action.”) (citing State Farm, 463 U.S. at 52, 103 S.Ct. 2856); Vera-Villegas v. I.N.S., 330 F.3d 1222, 1231 (9th Cir. 2003) (“[C]onjecture is not a substitute for substantial evidence.”) (quotation omitted).

While the Commission is certainly free to rely on the emergency room injury report data set, it may not do so in a way that cloaks its findings in ambiguity and imprecision, and consequently hinders judicial review. We leave it to the Commission to determine whether its methodology and data set can in fact support a higher standard. We find only that the Commission’s benefits statistics must instill in the Court a greater degree of confidence in their accuracy than is currently present here.14 In so holding, we offer no opinion on the number of injuries that would support issuance of a new magnet set safety standard.

2. The public’s need for magnet sets

Although the Commission’s evaluation of the costs of the rule to magnet distributors was adequate, its evaluation of the costs to consumers was incomplete. Specifically, the Commission failed to address an entire aspect of magnet sets’ utility—namely, the public’s need for the sets as scientific and mathematics education and research tools—and the rule’s probable effect on magnet sets’ availability and usefulness for those purposes. See 15 U.S.C. § 2058(f)(1)(C).

Numerous comments received by the Commission indicated that teachers and researchers use magnet sets to model and explain physics, biology, and geometry concepts.15 The Commission’s findings, however, contain no substantive discussion of those uses. The Commission’s analysis does not examine how widespread the claimed uses are, or whether substitute products, such as larger magnetic spheres or alternative construction toys, are (or could be made) available to serve those uses. Instead, the Commission’s cost finding referred only to “some unknown quantity of lost utility.” 16 C.F.R. § 1240.5(e)(6), (h)(2).

Even though the task may be difficult, the Commission is required to advance some explanation that allows a reviewing court to evaluate whether the cost of the lost utility is in fact outweighed by the benefits of the rule. See Aqua Slide, 569 F.2d at 840 (“The Commission does not have to conduct an elaborate cost-benefit analysis. It does, however, have to shoulder the burden of examining the relevant factors and producing substantial evidence to support its conclusion that they weigh in favor of the standard.”) (citation omitted). In this instance, the Commission abdicated that responsibility by failing to assess the demand for and usefulness of magnet sets as research and teaching tools. Without that information, the Court cannot accurately gauge the full costs of the safety standard.16 Cf. id. at 839-40 (citing Forester v. Consumer Prod. Safety Comm’n, 559 F.2d 774, 790-91 (D.C. Cir. 1977) (remanding safety requirements for bicycles because the “Commission has evidently not considered the utility of specific items that will be prohibited by the regulations”)).

III. Zen’s remaining arguments

We find the remainder of Zen’s challenges to the rule unpersuasive. To begin, we do not reach Zen’s alternative argument that the safety standard is in effect a ban. Because we find the Commission’s underlying cost and benefit findings are inadequate, we have no cause to decide whether, in the next stage of cost-benefit balancing, the Commission would be required to meet the arguably higher standard applicable to bans. See 15 U.S.C. § 2058(f)(3)(C) (requiring, in the case of a ban, that the Commission find “that no feasible consumer product safety standard under this chapter would adequately protect the public from the unreasonable risk of injury associated with such product”).

Next, we find no merit in Zen’s contention that the Commission did not comply with the APA’s notice-and-comment procedures. See 5 U.S.C. § 553(b), (c). Zen complains it did not receive adequate notice of the scope of the final rule because it was not given an opportunity to comment on the Commission’s insertion of the phrase “or commonly used” into the final definition of subject magnet sets. See 16 C.F.R. § 1240.2(b) (“Magnet set means: Any aggregation of separable magnetic objects that is a consumer product intended, marketed or commonly used as a manipulative or construction item for entertainment ....”) (emphasis added). However, “[i]t is a well settled and sound rule which permits administrative agencies to make changes in the proposed rule after the comment period without a new round of hearings.” Beirne v. Sec’y of Dep’t of Agric., 645 F.2d 862, 865 (10th Cir. 1981).

The primary limitation on that principle is that a final rule must be a “logical outgrowth” of the proposed rule. Am. Mining Cong. v. Thomas, 772 F.2d 617, 637 (10th Cir. 1985). “A final rule qualifies as a logical outgrowth if interested parties should have anticipated that the change was possible, and thus reasonably should have filed their comments on the subject during the notice-and-comment period.” CSX Transp., Inc. v. Surface Transp. Bd., 584 F.3d 1076, 1079-80 (D.C. Cir. 2009) (internal quotation marks omitted).

The Commission’s notice of proposed rulemaking expressly requested comments regarding the rule’s scope. See Proposed Rule: Safety Standard for Magnets, 77 Fed. Reg. at 53,788, 53,799. Moreover, the notice evinced the Commission’s concern that the proposed definition would not address the risks of magnets ostensibly marketed for purposes other than entertainment. See id. at 53,787. As such, it was reasonably foreseeable that the Commission would revise the definition to address that concern.

Moreover, the resulting revision was not “surprisingly distant” from the original definition, CSX Transp., 584 F.3d at 1080. Rather, the addition of the phrase “commonly used” is a logical outgrowth of the Commission’s original approach of targeting magnets according to their primary use. See Proposed Rule: Safety Standard for Magnets, 77 Fed. Reg. at 53,800 (original definition applying to magnets “intended or marketed ... primarily as” manipulative or construction entertainment items). The final rule confirms the purpose of the revision: “[It] seeks to prevent a manufacturer or importer of magnet sets from avoiding the rule by simply stating in marketing and other materials that the magnets are intended for uses other than those specified.” Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,973. Because the final definition of magnet sets is a logical outgrowth of the proposed definition, the APA’s notice requirement was satisfied.

Finally, we decline to rely on either party’s letter purporting to alert the Court to supplemental authority. Neither letter is sanctioned by Federal Rule of Appellate Procedure 28(j), which permits a party to bring new legal authority—not new evidence—to the attention of the court. See Utah v. U.S. Dep’t of Interior, 535 F.3d 1184, 1195 n.7 (10th Cir. 2008). Moreover, both letters improperly invite the Court to review the safety standard on grounds and evidence that was not available to the Commission in promulgating the rule. See Fed. Power Comm’n v. Transcon. Gas Pipe Line Corp., 423 U.S. 326, 331, 96 S.Ct. 579, 46 L.Ed.2d 533 (1976) (“[W]e have consistently expressed the view that ordinarily review of administrative decisions is to be confined to consideration of the decision of the agency and of the evidence on which it was based.... The focal point for judicial review should be the administrative record already in existence, not some new record made initially in the reviewing court.”) (internal alteration, quotation marks, and citations omitted); Custer Cnty. Action Ass’n v. Garvey, 256 F.3d 1024, 1027 n.1 (10th Cir. 2001) (“Judicial review of an agency decision is generally limited to review of the administrative record.”). Consistent with those conclusions, we do not consider either letter, and we DENY Zen’s motion to strike the Commission’s supplemental authority as moot.17

III. CONCLUSION

For the foregoing reasons, we VACATE and REMAND the Consumer Product Safety Commission’s magnet set safety standard, 79 Fed. Reg. 59,962 (codified at 16 C.F.R. §§ 1240.1-1240.5), to the Commission for further proceedings consistent with this opinion.

. Magnetic flux index is one way to estimate the attractive force of a magnet.

. However, small, powerful magnets can be included in “[h]obby, craft, and science kit-type items intended for children over 8 years of age,” so long as those items comply with certain safety labeling requirements. ASTM F963-11 4.38.3, S.17. Because neither party has cited or argued this aspect of the ASTM requirements on appeal, we do not address it further.

. This citation refers to the current version of the ASTM standard, which became effective June 2012. See Acceptance of ASTM F963-11 as a Mandatory Consumer Product Safety Standard, 77 Fed. Reg. 10,358, 10,358 (Feb. 22, 2012). The previous version, which was effective from August 2009 to June 2012, imposed the same requirements on children’s toy magnets. See id.; ASTM International Standard F963-08 Consumer Safety Specifications for Toy Safety §§ 3.1.33, 3.1.72, 4.38-4.38.1, and Fig. 3. In the interests of brevity and clarity, this opinion cites only to the current version of the ASTM standard.

. In cooperation with two distributors, the Commission also published a public service announcement regarding magnet sets’ dangers.

. The other two companies targeted by the Commission subsequently entered into settlement agreements to stop selling and to recall their products.

. The other difference between the two standards that figures into this litigation is the addition of the phrase “commonly used,” which enlarges the new rule in comparison with the toy standard. That difference is discussed infra in Section II.C.

. Additional conclusions (which we need not address, see infra Section II.C) are required for safety bans and rules implicating existing voluntary standards. See 15 U.S.C. § 2058(f)(3)(C), (D).

. Contrary to Zen’s contention, although courts once assumed that the Act’s substantial evidence standard of review was more stringent than the APA’s arbitrary and capricious standard of review, see, e.g., Aqua Slide, 569 F.2d at 837, that view is no longer viable. Courts now recognize that, “[w]hen the arbitrary or capricious standard is performing th[e] function of assuring factual support, there is no substantive difference between what it requires and what would be required by the substantial evidence test.” Olenhouse v. Commodity Credit Corp., 42 F.3d 1560, 1575 (10th Cir. 1994) (quoting Ass’n of Data Processing Serv. Orgs., Inc. v. Bd. of Governors of Fed. Reserve Sys., 745 F.2d 677, 683-84 (D.C. Cir. 1984) (“[I]n their application to the requirement of factual support the substantial evidence test and the arbitrary or capricious test are one and the same.”)).

. That is, the Commission estimated that an average of 610 emergency room-treated injuries per year occurred during the three and a half years from January 2009 through June 2012. 16 C.F.R. § 1240.5(e)(2). But the Commission estimated that an average of only 580 emergency room-treated injuries per year occurred during the five years from January 2009 through December 2013. 16 C.F.R. § 1240.5(a). In order for the data collected during those last eighteen months to reduce the annual average by that amount, estimated emergency room injuries must have decreased by about 100 injuries per year.
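
As a rough check of that arithmetic (a sketch only, assuming each figure is a simple annual mean over 3.5 and 5 years respectively, and writing $\bar{x}$, a symbol introduced here purely for illustration, for the implied annual average over the final eighteen months):

```latex
\begin{align*}
3.5 \times 610 &= 2135 && \text{(estimated injuries, Jan. 2009 through June 2012)} \\
5 \times 580 &= 2900 && \text{(estimated injuries, Jan. 2009 through Dec. 2013)} \\
\bar{x} &= \frac{2900 - 2135}{1.5} = \frac{765}{1.5} = 510 && \text{(implied annual average, final eighteen months)} \\
610 - 510 &= 100 && \text{(implied decrease in injuries per year)}
\end{align*}
```

On those assumptions, the final eighteen months averaged roughly 510 estimated injuries per year, about 100 fewer than the earlier average of 610.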

. See generally Dist. Hosp. Partners, L.P. v. Burwell, 786 F.3d 46, 56-57 (D.C. Cir. 2015) (“[A]n agency cannot ignore new and better data.”); Sierra Club v. U.S. EPA, 671 F.3d 955, 968 (9th Cir. 2012) (“[W]e should not silently rubber stamp agency action that is arbitrary and capricious in its reliance on old data without meaningful comment on the significance of more current compiled data.”).

. The Commission contends Zen forfeited the time frame argument by failing to raise it before the Commission during the notice-and-comment period. Generally, a party challenging an agency regulation must have initially presented its concerns to the agency during the rulemaking process in order for a reviewing court to consider those concerns. See Nutraceutical Corp. v. Von Eschenbach, 459 F.3d 1033, 1041 n.9 (10th Cir. 2006). There are, however, exceptions to that requirement.

Claims not raised before an agency are not waived if “the problems underlying the claim are ‘obvious.’” Forest Guardians v. U.S. Forest Serv., 495 F.3d 1162, 1170 (10th Cir. 2007) (quoting Dep’t of Transp. v. Pub. Citizen, 541 U.S. 752, 765 (2004)); see also Sierra Club, Inc. v. Bostick, 787 F.3d 1043, 1048 (10th Cir. 2015) (same). Because the Commission’s own calculations show a marked reduction in post-enforcement injury rates, the potential problems with that assumption were obvious. See Forest Guardians, 495 F.3d at 1170. Accordingly, we are not prohibited from reaching Zen’s time frame argument.

. Specifically, of the eighty-six injury reports on which the Commission based its benefits finding, only nine definitively involved the subject magnet sets (as evidenced by brand names mentioned in the injury reports). See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,978. The remaining seventy-seven injuries “were determined possibly to have involved the magnets of interest” based on a keyword search. Id. (including injury reports with keywords and phrases such as “high-powered,” “magnetic ball,” “magnetic marble,” “BB size magnet,” and “magnet beads”).

. Although a greater percentage of the incidents reported directly to the Commission can be reliably traced to subject magnet sets, 16 C.F.R. § 1240.5(a), the Commission specifically found that those anecdotal incidents could not be used to estimate nationwide injuries, Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,969.

. The Commission attempted to bolster its injury count by pointing out that (1) some injury reports that did not alert to the keyword search may have in fact involved magnet sets and (2) some medical experts have opined that the available medical research undercounts injuries associated with magnet sets. See Final Rule: Safety Standard for Magnet Sets, 79 Fed. Reg. at 59,966, 59,980. Those two facts suggest that the true injury count may be higher than the Commission’s estimate. The Court, however, takes issue with the uncertainty of the Commission’s estimate—not with its magnitude. Inasmuch as those two facts are subject to their own reliability concerns (namely, imprecise report narratives cannot be traced to magnet sets, and the experts did not quantify the degree to which they believe injuries are undercounted), they do not assuage the Court’s concerns about the accuracy of the Commission’s estimate.

. See, e.g., AR 911 (“As a physicist, graduate instructor, and Los Alamos National Laboratory employee, I wish to strongly oppose the ban on magnetic toys. These magnet sets are of tremendous educational values [sic], and I have used them in the classroom as well as at scientific community outreach events.”); 1410 (“As a practicing physicist, I have used these magnets for experimental and demonstrative purposes, and they are very effective tools.”); 1601 (“I am a high school biology teacher and I use magnets such as these as an invaluable teaching tool when discussing proteins structure and function.... It would be a great hardship to me and to the education of my students if these tools were no longer available.”); 1696 (“I do research into geometric lattice theory with these mini-magnets ... [now] I can no longer take advantage of a powerful tool”); 3704 (high school math department chair stating: “The Buckyball Science toy [ (a certain brand of magnet set) ] has been a remarkable teaching and learning tool in our home and in the classroom.”).

. The Commission does not contend that it would not enforce the rule, which applies to magnets “intended, marketed or commonly used ... for entertainment,” 16 C.F.R. § 1240.2(b), against magnets also used for teaching or research. Although the Commission’s regulatory analysis does not quantify what amount of use it considers “common,” it does make clear that entertainment use by some consumers could trump both a distributor’s stated intentions and alternative uses by other consumers. See Final Rule: Safety Standard for Magnets, 79 Fed. Reg. at 59,973 (“Common uses may be indicated by information found in consumer reports to the [Commission], firm reports to the [Commission], injury reports, and consumer comments/reviews posted on product Web sites stating that a product, regardless of whether it is intended or marketed by the manufacturer as such, was, in fact, being used as a manipulative or construction item for entertainment, such as puzzle working, sculpture building, mental stimulation or stress relief.”) (emphasis added). Accordingly, the Commission should have considered all substantial uses—including research and education—that could be foreclosed by full enforcement of the rule.

. Of course, on remand, the Commission is free to consider that evidence if it reopens the administrative record to conduct additional fact-finding.