OPINION AND ORDER
SHIRA A. SCHEINDLIN, District Judge.
I. INTRODUCTION
Gucci America, Inc. (“Gucci”) brings this action against Guess?, Inc., Marc Fisher Footwear LLC, the Max Leather Group/Cipriani Accessories, Inc., Sequel AG, K & M Associates L.P., Viva Optique, Inc., Signal Products, Inc., and Swank, Inc. (collectively, “Guess”), alleging various violations of the Lanham Act,1 as well as related New York state law.2 Currently before the court are cross-motions to exclude expert survey reports. Specifically, Gucci seeks to exclude the surveys conducted by Dr. Myron J. Helfgott and Dr. Carol A. Scott,3 while Guess seeks to exclude the surveys conducted by Dr. Michael Rappeport, Dr. Michael B. Mazis, and Mr. George Mantis.4 At a hearing on August 4, 2011, I excluded the reports of Mr. Mantis, Dr. Rappeport, and Dr. Scott. I also limited the admissibility of Dr. Helfgott’s report to the issue of point-of-sale confusion.5 By letter, Gucci requested that I reconsider these rulings.6 For the reasons given below, both motions are now granted in part and denied in part.
II. BACKGROUND
A. The Gucci Surveys
Gucci has submitted expert reports on two substantive surveys. The first, conducted by Mr. Mantis, deals with post-sale confusion, whereas the second, conducted by Dr. Mazis, deals with association in the context of dilution. Guess argues that both of the surveys are so methodologically unsound that they must be excluded.7
1. The Mantis Survey
a. Design and Operation
Mr. Mantis conducted a so-called “Eveready” survey,8 and attempted to measure consumer confusion between “a Guess cross-body bag that is beige in color, and bears a repeating diamond-shaped pattern with the letter ‘G’ in the corners of the diamonds, and ... the Guess ‘Quattro G Design’ ... in the center of each diamond” with Gucci’s “Diamond Motif Trade Dress.”9 Respondents were shown photographs of either a test bag or a control bag and then asked about source, connection/affiliation, and approval/sponsorship.10 The test bag was a Guess “Citizen G” cross-body bag designed for men, which Mr. Mantis modified so that the center brown-red-brown stripe was solid brown. The control bag was the same size and shape as the test bag, but modified to have a blue background, a solid blue center stripe, and the Guess “Quattro G” pattern turned 45 degrees and arranged in horizontal and vertical rows.11
[[Image here]]
Respondents were told to look at their assigned photographs as if they saw someone wearing the bag in passing.12 After taking the photographs away, the interviewer asked the following questions: (1) “What company do you think makes or puts out the bag shown in the photographs?” (2) “Do you think that the company that makes or puts out the bag shown in the photographs makes or puts out any other brands?” (3) “Do you think that the company that makes or puts out the bag shown in the photographs is or is not connected to or affiliated with any other company or brand?” and (4) “Do you think that the bag shown in the photographs is or is not made or put out with the approval or sponsorship of any other company or brand?” If respondents gave an answer to any of these questions, the interviewer would ask one of the following questions: (1) “What other company or brand?” or “What other brands do you think are made or put out by that company?” or “With which other company or brand?” (2) “What makes you say that?” (3) “What do you mean by that?” (4) “Anything else?” and finally (5) “What do you mean by that?”13
b. Results and Coding
Respondents were coded into groups depending on the brands they mentioned when answering substantive questions (1) through (4). Respondents who answered “Gucci” were coded as “confusion responses,” and respondents who answered “Gucci” and something else or “maybe Gucci” were coded as “qualified confusion responses.” Respondents who gave any other brand were coded as “other responses.”14 Confused respondents were further grouped into two categories, “appearance-related reasons” and “other reasons,” based on the reasons they gave for their confusion.15 Using this system, Mr. Mantis found that 48 of 199 test group respondents, or 24.1 percent, were confused for appearance-related reasons, and that 17 of 201 control group respondents, or 8.5 percent, were confused for appearance-related reasons. Accordingly, Mr. Mantis found that a net total of 15.6 percent of the test group respondents were confused for appearance-related reasons,16 and concluded that “the Guess cross-body bag is likely to cause confusion with Gucci’s Diamond Motif Trade Dress.”17
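The net figure reported for the Mantis Survey can be checked with simple arithmetic. The sketch below is an illustration only and is not drawn from the Mantis report; the helper name `pct` is hypothetical. It suggests that the reported 15.6 percent reflects subtraction of the rounded rates (24.1 minus 8.5), whereas subtracting the unrounded rates yields roughly 15.7 percent.

```python
def pct(hits, n):
    """Return hits as a percentage of n respondents (hypothetical helper)."""
    return 100 * hits / n

# Counts taken from the opinion's description of the Mantis Survey.
test_rate = pct(48, 199)      # test group, appearance-related confusion
control_rate = pct(17, 201)   # control group, appearance-related confusion

# Net confusion computed two ways: from rates rounded to one decimal
# place, and from the unrounded rates.
net_from_rounded = round(round(test_rate, 1) - round(control_rate, 1), 1)
net_unrounded = round(test_rate - control_rate, 1)

print(round(test_rate, 1), round(control_rate, 1))
print(net_from_rounded, net_unrounded)
```

The one-tenth of a point difference between the two methods is a rounding matter and does not bear on the court's analysis.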
c. Rebuttal to the Mantis Survey
Guess proffers an affidavit from its own expert, Dr. Shari S. Diamond, to challenge the Mantis Survey.18 Dr. Diamond’s affidavit, as well as Gucci’s response to it, are considered in Part IV below.
2. The Mazis Survey
a. Design and Operation
Dr. Mazis also conducted an “Eveready” survey, which he designed solely “to assess the degree of association, if any, between Guess’ use of the diamond motif and Gucci.”19 Respondents were shown photographs of either the test bag or the control bag, and asked what other product or brand came to mind “based on the overall appearance” of that bag. The test bag was an unmodified Guess “Basique Bowler” bag that used the allegedly infringing trade dress. The control bag was the same size and shape as the test bag, with the following differences: the nameplate text was changed from script to block letters; the background color was changed from beige to blue; the dashed lines and “G’s” in the corner of the diamonds were removed; and the Quattro G pattern was rotated and arranged in horizontal and vertical lines.20
[[Image here]]
Respondents were given two photographs of their assigned bag (one of the front, one of the back) and instructed to look at them “as if you saw someone carrying [the bag].” After being told not to guess, they were asked, “If you have an opinion, does or doesn’t any other product or brand come to mind when you look at the overall appearance of this handbag?” Respondents who answered positively were then asked, “What other product or brand comes to mind when you look at the overall appearance of this handbag? Any others?” Finally, for each brand or product mentioned, the respondents were asked, “Why do you say that (PRODUCT OR BRAND MENTIONED) comes to mind when you look at the overall appearance of this handbag? Any other reasons?”21
b. Results and Coding
Like Mr. Mantis, Dr. Mazis coded respondents into categories depending on whether they gave appearance-related or non-appearance-related explanations for their answers. He found that 42 of 203 respondents in the test group, or 20.7 percent, and that 18 of 206 respondents in the control group, or 8.7 percent, felt Gucci came to mind for “relevant appearance-related reasons.” Accordingly, he concluded that a net of “12 [percent] of respondents associated the Guess Basique Bowler handbag with Gucci, a net result well in excess of association with any other brand,”22 and that “consumers associate the diamond motif pattern appearing on the Guess Basique Bowler handbag with Gucci.”23
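The Mazis figures can be verified in the same way. The following is a minimal sketch, assuming that net association is simply the test-group rate minus the control-group rate; the function name `net_rate` is hypothetical and does not come from the Mazis report.

```python
def net_rate(test_hits, test_n, control_hits, control_n):
    """Return (test %, control %, test % minus control %)."""
    test_pct = 100 * test_hits / test_n
    control_pct = 100 * control_hits / control_n
    return test_pct, control_pct, test_pct - control_pct

# Counts taken from the opinion's description of the Mazis Survey.
test_pct, control_pct, net = net_rate(42, 203, 18, 206)

print(round(test_pct, 1), round(control_pct, 1), round(net))
```

On these counts the net association rounds to the 12 percent that Dr. Mazis reported.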
c. Rebuttal to the Mazis Survey
Guess proffers the above-mentioned affidavit of Dr. Diamond to challenge the Mazis Survey. That affidavit, along with other relevant material, is discussed in Part IV below.
B. The Guess Surveys
Guess offers three surveys designed to measure point-of-sale confusion, two by Dr. Helfgott and another by Dr. Scott. Gucci seeks to exclude all of them as irrelevant to the matter of post-sale confusion.24
1. The Helfgott Surveys
Dr. Helfgott conducted two surveys. One survey used the Guess “Osaka” bag as the test bag (“Osaka Survey”), while the other used the Guess “Daisy Logo” bag (“Daisy Logo Survey”). Both were designed to “determine whether the Guess handbags selected were likely to cause consumers to mistakenly believe that the handbags were put out by, in association with, or with the approval of Gucci America.” 25
a. The Osaka Survey
i. Design and Operation
The purpose of the Osaka Survey was “to determine any potential confusion resulting from the use of Guess’s Quattro G Design in combination with brown-beige fabric.”26 Respondents were shown either a test bag or a control bag and asked about source, association and approval.27 The test bag was an unmodified Guess “Osaka” handbag. The control bag was identical to the test bag in every way except that it used a “pink-on-cream” color scheme.28
[[Image here]]
Respondents were given their assigned handbag and asked to “look it over as you would if you were in a store and were seriously considering buying it.”29 The interviewer then asked the following questions: (1) “What company do you think puts out this handbag ... or don’t you know?” (2) “Do you think the company that puts out this handbag puts it out themselves, or in association with some other company ... or don’t you know?” and (3) “Do you think the company that puts out this handbag did, or did not, need the approval of another company ... or don’t you know?” If respondents answered positively, the interviewer then asked the following questions: (1) “What in particular about this handbag makes you think (that)?” and (2) “Anything else?”30
ii. Results and Coding
Respondents were coded into two groups. If they mentioned Gucci in any way, they were coded as indicating a likelihood of confusion. If they mentioned Guess without mentioning Gucci, they were coded as indicating a correct identification. Dr. Helfgott considered all answers to all three substantive questions above in a single confusion analysis,31 and found that five percent of the test group and three percent of the control group indicated a likelihood of confusion.32 Accordingly, he concluded that there was “no [net] likelihood of confusion with regard to Guess handbags bearing the Quattro G design in combination with a brown-beige color fabric.”33
b. The Daisy Logo Survey
i. Design and Operation
The Daisy Logo Survey was designed “to determine any potential confusion resulting from Guess’s Quattro G Design, which consists of: Guess’s Quattro G mark ... surrounded by intersecting lines ... with single letter “G”s at the corners created by the line intersections.”34 Respondents were shown either a test bag or one of two control bags and asked about source, association and approval in a manner identical to the Osaka Survey described above. The test bag was a Guess “Daisy Logo” handbag. The first control was identical to the Daisy Logo handbag, except that the “G”s in the corners of the intersecting lines were removed. The second control was similarly modified, except that the intersecting lines were also removed.35
[[Image here]]
[[Image here]]
ii. Results and Coding
Using the same coding method as he did in the Osaka Survey, Dr. Helfgott found that 1.5 percent of the test group respondents were confused, that 1 percent of the first control group were confused, and that 3 percent of the second control group were confused. Based on these results, he concluded that there was “no likelihood of confusion for either of the two conditions tested.”36
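The opinion does not state net figures for the Daisy Logo Survey. As a rough illustration only, and assuming the same test-minus-control subtraction used in the other surveys, the reported percentages imply the following differences against each control. The dictionary keys are hypothetical labels, not terms from the Helfgott report.

```python
# Percentages as reported in the opinion for the Daisy Logo Survey.
test_pct = 1.5
controls = {
    "control 1 (corner Gs removed)": 1.0,
    "control 2 (corner Gs and lines removed)": 3.0,
}

# Test-minus-control difference for each control bag.
net_vs_controls = {name: test_pct - pct for name, pct in controls.items()}

for name, net in net_vs_controls.items():
    print(name, net)
```

Under that assumption the differences are half a percentage point against the first control and a negative 1.5 points against the second, which is consistent with Dr. Helfgott's conclusion that neither condition showed a likelihood of confusion.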
c. Rebuttal to the Helfgott Surveys
Gucci proffers the report of Dr. Itamar Simonson to challenge the Helfgott Surveys.37 That report, along with other relevant material, is discussed in Part IV below.38
2. The Scott Survey
a. Design and Operation
Dr. Scott conducted an “Eveready” survey that she designed “to determine the likelihood of confusion, if any, resulting from the use on a Guess leather belt of a Square G buckle.”39 Respondents were shown either a test belt or a control belt and asked about source, association, and approval.40 The test belt was a Guess belt with a Square G buckle, modified with a piece of tape to obscure the “Guess” name on the outside of the belt. The control belt was a Guess belt with a Round G buckle.41
[[Image here]]
Respondents were given as much time as they wanted to examine their assigned belt “as if they were considering purchasing it.” The interviewer then asked the following questions: (1) “What company or companies do you think puts out this belt ... or do you not know?” (2) “Do you think that the company or companies that put out this belt puts out themselves, or puts out in association with any other company or companies, ... or do you not know?” (3) “Do you think that the company that puts out this belt did or did not need the approval of any other company or companies, ... or do you not know?” Respondents who answered positively were asked for the names of the relevant company or companies, as well as their reasons for giving those names.42
b. Results and Coding
Dr. Scott separates her results on a question-by-question basis. In response to the first question, she found that 32 of 199 test group respondents, or approximately 16 percent, and 43 of 202 control group respondents, or approximately 21.3 percent, mentioned either Gucci or Gucci and Guess.43 The vast majority of these responses were for appearance-related reasons, although Dr. Scott did not explicitly code them as such.44
In response to the third question, 2 of 199 test group respondents, or approximately 1 percent, and 3 of 202 of the control group respondents, or approximately 1.5 percent, believed that the company that put out the belt needed Gucci’s approval to do so.45 For the test belt, the explanations for these answers referenced the letter G, the material, or the general style of the belt, whereas for the control belt, the explanations referenced the logo, look, design, and style of the belt.46
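Dr. Scott reports her results question by question, and the opinion does not give net figures. The sketch below is an illustration only; it assumes the same test-minus-control subtraction used in the Gucci surveys, and the names `scott_counts` and `rates` are hypothetical.

```python
# (test hits, test n, control hits, control n), from the opinion.
scott_counts = {
    "question 1 (source)": (32, 199, 43, 202),
    "question 3 (approval)": (2, 199, 3, 202),
}

def rates(test_hits, test_n, control_hits, control_n):
    """Return test %, control %, and their difference, each to one decimal."""
    test_pct = 100 * test_hits / test_n
    control_pct = 100 * control_hits / control_n
    return (round(test_pct, 1), round(control_pct, 1),
            round(test_pct - control_pct, 1))

scott_rates = {q: rates(*counts) for q, counts in scott_counts.items()}

for question, (test_pct, control_pct, net) in scott_rates.items():
    print(question, test_pct, control_pct, net)
```

On both questions the control belt drew slightly more Gucci mentions than the test belt, so the difference computed this way is negative.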
Based on these results, Dr. Scott concluded that there was “no evidence that consumers uniquely associated the Square G design with Gucci,” and that “consumers are not likely to believe that Guess leather belts that have a Square G buckle are made by Gucci because of the particular shape of the G.”47 She also notes that far more respondents believed that Guess was the source of either belt, rather than Gucci.48
c. Rebuttal to the Scott Survey
Gucci also offers the above-mentioned report of Dr. Simonson to rebut the Scott Survey. It is discussed together with Guess’s sur-rebuttal in Part IV below.
III. APPLICABLE LAW
A. Infringement Actions Under the Lanham Act
1. Registered Trademarks
Section 32(1) of the Lanham Act prohibits the use in commerce of “any reproduction, counterfeit, copy, or colorable imitation of a registered mark in connection with the sale, offering for sale, distribution, or advertising of any goods or services on or in connection with which such use is likely to cause confusion, or to cause mistake, or to deceive.”49 The Lanham Act defines a “counterfeit” as a “spurious mark which is identical with, or substantially indistinguishable from, a registered mark.”50 Pursuant to 15 U.S.C. § 1057(b), “[a] certificate of registration of a mark upon the principal register ... shall be prima facie evidence of the validity of the registered mark and of the registration of the mark, of the registrant’s ownership of the mark, and of the registrant’s exclusive right to use the registered mark in commerce ....”51 Moreover, “[a] registered trademark becomes incontestable if it has been in continuous use for five consecutive years subsequent to its registration and is still in use.”52
Trademark infringement claims are “analyzed under [a] familiar two-prong test ....”53 This test “looks first to whether the plaintiff’s mark is entitled to protection, and second to whether defendant’s use of the mark is likely to cause consumers confusion as to the origin or sponsorship of the defendant’s goods.”54 The latter inquiry “turns on whether ‘numerous ordinary prudent purchasers are likely to be misled or confused as to the source of the product in question because of the entrance in the marketplace of defendant’s mark.’ ”55 Additionally, a finding of infringement requires that there be “a ‘probability of confusion, not a mere possibility.’ ”56 Finally, “[t]he central consideration in assessing a mark’s protectability, namely its degree of distinctiveness, is also a factor in determining likelihood of confusion.”57
2. Trade Dress
The Lanham Act “has been held to embrace not just word marks ... but also ‘trade dress’ — a category that originally included only the packaging, or ‘dressing,’ of a product,”58 but now “encompasses the overall design and appearance that make the product identifiable to consumers.”59 Nonetheless, in order to prevent the law of trade dress from slipping into “protection for an [otherwise] unprotectable style, theme, or idea,”60 the Second Circuit requires that a plaintiff asserting trade dress rights offer “a precise expression of the character and scope of the claimed trade dress.”61 The plaintiff must also allege that the trade dress has acquired a “secondary meaning” that is distinctive as to the origin of the product; 62 that the trade dress is “not functional”; 63 and that the defendant’s use of a similar trade dress is likely to cause consumer confusion as to the origin of the product.64
3. Likelihood of Confusion
Courts in the Second Circuit apply the eight-factor balancing test introduced in Polaroid Corp. v. Polarad Electronics Corp. in determining whether there is a likelihood of confusion.65 The Polaroid factors are: (1) the strength of plaintiff’s mark; (2) the similarity of plaintiff’s and defendant’s marks; (3) the proximity of the products; (4) the likelihood that plaintiff will “bridge the gap”; (5) actual confusion between products; (6) defendant’s good or bad faith in adopting the mark; (7) the quality of defendant’s product; and (8) the sophistication of the buyers.66 “The application of the Polaroid test is ‘not mechanical, but rather, focuses on the ultimate question of whether, looking at the products in their totality, consumers are likely to be confused.’ ”67 “No single factor is dispositive, nor is a court limited to consideration of only these factors.”68 “Further, ‘each factor must be evaluated in the context of how it bears on the ultimate question of likelihood of confusion as to the source of the product.’ ”69
B. Admission of Expert Testimony
The proponent of expert evidence bears the initial burden of establishing admissibility by a “preponderance of proof.”70 Rule 702 of the Federal Rules of Evidence states the requirements for the admission of expert testimony as follows:
If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.
Under Rule 702 and Daubert v. Merrell Dow Pharmaceuticals, Inc., the district court must determine whether the proposed expert testimony “both rests on a reliable foundation and is relevant to the task at hand.”71 This means that the district court must act as “ ‘a gatekeeper to exclude invalid and unreliable expert testimony.’ ”72 In doing so, the court’s focus must be on the principles and methodologies underlying the expert’s conclusions, rather than on the conclusions themselves.73 “[T]he Federal Rules of Evidence favor the admissibility of expert testimony, and [courts’] role as gatekeeper is not intended to serve as a replacement for the adversary system.”74
In addition, Rule 403 of the Federal Rules of Evidence states that relevant evidence “may be excluded if its probative value is substantially outweighed by the danger of unfair prejudice, confusion of the issues, or misleading the jury.” “Expert evidence can be both powerful and quite misleading because of the difficulty in evaluating it. Because of this risk, the judge in weighing possible prejudice against probative force under Rule 403 ... exercises more control over experts than over lay witnesses.”75 Of course, expert evidence that is wholly irrelevant is inadmissible under Federal Rule of Evidence 402.
C. Survey Evidence
Factor Five of the Polaroid inquiry concerns “actual confusion” between products. While it is “self-evident that the existence of actual consumer confusion indicates a likelihood of consumer confusion,”76 it is also well-established that a plaintiff seeking to prevail under the Lanham Act need not prove the existence of actual confusion, “since actual confusion is very difficult to prove and the Act requires only a likelihood of confusion as to source.”77
Parties to trademark infringement actions frequently use consumer surveys to demonstrate or refute a likelihood of consumer confusion.78 Obviously, “[s]urveys do not measure the degree of actual confusion by real consumers making mistaken purchases. Rather surveys create an experimental environment from which we can get useful data from which to make informed inferences about the likelihood that actual confusion will take place.”79
Reliance on expert studies is not without hazards. Indeed, “any survey is of necessity an imperfect mirror of actual customer behavior under real life conditions.... It is notoriously easy for one survey expert to appear to tear apart the methodology of a survey taken by another.”80 Practically speaking, there is “no such thing as a ‘perfect’ survey. The nature of the beast is that it is a sample, albeit a scientifically constructed one.”81
To assess the validity and reliability of a survey, a court should consider a number of criteria, including whether:
(1) the proper universe was examined and the representative sample was drawn from that universe; (2) the survey’s methodology and execution were in accordance with generally accepted standards of objective procedure and statistics in the field of such surveys; (3) the questions were leading or suggestive; (4) the data gathered were accurately reported; and (5) persons conducting the survey were recognized experts.82
“[T]he closer the survey methods mirror the situation in which the ordinary person would encounter the trademark, the greater the evidentiary weight of the survey results.”83 The failure of a survey to approximate actual marketplace conditions can provide grounds for inadmissibility.84 Finally, while errors in survey methodology usually go to weight of the evidence, a survey should be excluded under Rule 702 when it is invalid or unreliable, and/or under Rule 403 when it is likely to be insufficiently probative, unfairly prejudicial, misleading, confusing, or a waste of time.85
IV. DISCUSSION
At the August 4, 2011 hearing in this matter, Gucci made it crystal clear that it was proceeding solely on a theory of post-sale confusion.86 Accordingly, determining whether a survey is relevant to the issue of post-sale confusion is a threshold matter that must be addressed before any others.
A. The Gucci Surveys
In its motion in limine, Guess argues that the Mazis Survey and the Mantis Survey both suffer from a number of flaws that, taken together, require each to be excluded under Federal Rules of Evidence 702 and 403. For the reasons given below, Guess’s motion is granted in part and denied in part.87
1. Motion in Limine to Exclude the Mazis Survey on the Issue of Dilution88
Guess argues that the Mazis Survey is flawed in two respects. First, it argues that Dr. Mazis used a flawed control. Second, it argues that Dr. Mazis improperly over-reported the degree of association that his survey measured. For the reasons given below, I find that both of these arguments go to weight, not admissibility, and that their cumulative effect is not so great as to require exclusion. Accordingly, I find that the Mazis Survey is admissible on the issue of dilution.
a. The Mazis Survey Did Not Use a Flawed Control
The first argument Guess raises is that the control in the Mazis Survey was flawed because it removed too many elements of the allegedly infringing trade dress at once.89 According to Guess, Dr. Mazis should have removed each allegedly infringing trade dress component one by one, so as to be able to determine if some resulting unprotectable combination of elements gave rise to a substantial portion of any association measured.90
The scientific literature cited in the briefs lends some support to this argument. It is true, for example, that a control that contains more elements of the allegedly infringing trade dress is “stronger” than a control that contains fewer such elements.91 It is also true that “multiple controls are generally, but not always, better than a single control.”92 However, while the fact that a survey used a control that could have been “stronger” or “better” may mean it is entitled to less weight, it does not mean that the survey does not provide relevant information.
Cumberland Packing Corp. v. Monsanto Co., which Guess cites in its brief, is not to the contrary.93 In that case, Judge Eugene Nickerson was critical of coding responses as “confused” when respondents mentioned something other than the elements of the allegedly infringing trade dress to explain their confusion.94 Nothing in his opinion, however, even hinted that a survey measuring confusion — or association — caused by an allegedly infringing multi-element trade dress requires the use of multiple controls. Indeed, at least one court, in a far more recent decision, has approved of a survey that used a single control that eliminated more than ten trade dress elements at once.95
In sum, neither science nor law mandate the exclusion of a survey that uses a single control to measure association attributable to an allegedly infringing multi-element trade dress. While such a survey may be entitled to less weight than one with closer-to-ideal controls, its relevance to the issue of association is not thereby eliminated for purposes of determining its admissibility. Accordingly, I find that Guess’s first argument, standing alone, does not affect the admissibility of the Mazis Survey.96
b. The Mazis Survey Did Not Improperly Inflate Association Levels
Guess next argues that the Mazis Survey improperly over-reported association by including respondents who explained their association by reference to either a single component of the allegedly infringing trade dress, or to something unrelated.97 According to Guess’s analysis of the verbatim survey responses, such coding led Dr. Mazis to over-report association by 4.5 percent.98
Reduced to its essentials, Guess’s argument here is that a survey respondent is properly coded as associating the allegedly infringing trade dress with Gucci if — and only if — she either explains her association by articulating every component of that trade dress, or by referencing the trade dress in a general manner. This argument is not totally off base. Clearly, a survey in which all respondents explained their conclusions in this idealized manner would be preferable to one in which few or no respondents did so. It is equally clear, however, that expecting survey respondents to be able to parse their thought processes with such a high degree of specificity in response to an open-ended “why-do-you-say-that” question is unrealistic.99
Instead, there will likely be a spectrum of response explanations. On one end, an explanation might be the ideal kind that Guess charges are necessary. On the other end, an explanation might not mention any component of the allegedly infringing trade dress. Most surveys, however, will have explanations spread out across this spectrum. With the understanding that the weight given to a survey will vary as the explanations tend towards one end or the other, most surveys should be admitted as relevant.
It is possible, of course, that a survey might be so filled with dubious explanations that it could rise to the level of being excludable as insufficiently probative under Rule 403, or even wholly irrelevant under Rule 402. Without determining exactly where those levels may lie, however, I find that the Mazis Survey does not reach them.
In sum, although the Mazis Survey is flawed in that it used a less-than-ideal control, and although one might wish that it contained more specific response explanations, these issues are relatively minor. Accordingly, while Guess is free to make any and all reasonable arguments as to the weight I should accord it on summary judgment, I find that the Mazis Survey should be admitted on the issue of association, as that issue relates to the larger dilution analysis.
2. Motion in Limine to Exclude the Mantis Survey
The first two issues Guess raises against the Mantis Survey — that it used an improper control and that it improperly over-reported its confusion results — are identical to the issues it raised against the Mazis Survey. For the same reasons given above, I conclude that those issues go to weight, not admissibility. In addition to these issues, however, Guess also argues that the Mantis Survey is flawed because it failed to approximate actual marketplace conditions in two ways. First, Guess notes that the Mantis Survey used a modified test bag that never existed. Second, Guess notes that the test bag was not representative of the line of bags from which it was selected because it did not clearly bear the Guess name.100 Each of these points is addressed in turn.101
a. By Using a Modified Test Bag, the Mantis Survey Failed to Reflect Actual Marketplace Conditions
Gucci first argues that the use of a modified test bag in the Mantis Survey was “a conservative measure” consistent with “accepted industry practice”102 designed to isolate the effect of the allegedly infringing trade dress. In support of this point, Gucci points out that the Scott Survey, which Guess seeks to admit on its own behalf, also used a modified test product for much the same reason.103 Gucci does not cite any scientific literature to support its claim on this issue, however, and my own research has revealed none. Furthermore, according to Dr. Diamond, Guess’s rebuttal expert, Gucci’s explanation is “pure speculation.” Accordingly, Dr. Diamond concludes that “[w]e simply do not know how viewers would have responded to the real bag because the real bag was never tested.”104 I agree, and find that Gucci has not shown that survey industry practice justifies the use of a modified test bag.
Gucci does, however, cite cases from the Second Circuit and the Ninth Circuit that have found that this sort of modification goes to weight, not admissibility. Gucci argues that Lois Sportswear, U.S.A., Inc. v. Levi Strauss & Co.105 stands for the proposition that when a post-sale confusion survey modifies the normal post-sale appearance of its test product, any dispute over that goes to weight rather than admissibility.106
Gucci also cites adidas-Salomon AG v. Target Corp.107 and adidas America v. Payless Shoesource,108 both of which involved a survey where an element of the allegedly infringing trade dress was digitally removed from the test product in order to focus respondents’ attention on other factors. According to Guess, these cases are not relevant because the element that was removed was acknowledged to be independently famous.109 I find nothing in either of the cases to support Guess’s point; that is, neither case relies on the fact that the element that was digitally removed was independently famous in order to reach the conclusion that objections to the modification went to weight, not admissibility.110
Gucci does not dispute that the use of a modified test bag in the Mantis Survey constituted a failure to replicate actual marketplace conditions. The issue is simply whether that failure goes to the weight to be accorded to the survey, or whether, as Guess argues, it justifies excluding the Mantis Survey on the grounds that it “reveals nothing about any purported actual confusion in the real world.”111 Based on the discussion of the case law above, I find that, in all but the most extreme cases, a failure to replicate actual marketplace conditions goes to a survey’s weight, not its admissibility. While the use of a modified test bag here was a failure to reflect actual marketplace conditions, I conclude that, standing alone, it does not require exclusion of the Mantis Survey.
b. By Using a Highly Unrepresentative Test Bag, the Mantis Survey Failed to Reflect Actual Marketplace Conditions
According to Dr. Diamond’s analysis of the underlying sales data, more than 99 percent of bags bearing the allegedly infringing trade dress also bore “either the GUESS name on the front of the bag, or the GUESS name on large G-shaped hardware on the handbag, or both.”112 Guess argues that selecting a bag without the Guess name on it constituted a “deviation from actual marketplace conditions” that “precludes the [Mantis Survey’s] admission as providing reliable evidence of representative real world post-sale situations.”113 Guess cites several cases in support of this proposition. Two of them, Juicy Couture, Inc. v. L’Oreal USA, Inc.114 and this court’s decision in THOIP v. Walt Disney Co.,115 are opinions from this district. Two other cases, American Footwear116 and Beverage Marketing USA, Inc. v. South Beach Beverage Corp.,117 are Second Circuit opinions.
Guess’s reliance on my opinion in THOIP is somewhat misplaced. In that case, I noted that a failure to reflect actual marketplace conditions was a major flaw that “severely diminishe[d] the reliability and probative force” of the survey in question.118 However, I also noted that the survey in question failed to use an *744adequate control.119 It was the combined effect of these major flaws, not the independent effect of either, that led me to exclude the survey in question.120 Guess’s reliance on Juicy Couture is also not completely on point. The survey in that case completely failed to replicate real-world conditions with respect to packaging, branding, and location of sale. While Judge Denise Cote criticized the survey as one that would likely be inadmissible in a jury trial, she actually denied a motion in limine to exclude it.121 Accordingly, all that these two cases stand for is a proposition that is not in dispute: a survey’s failure to reflect actual marketplace conditions is a serious matter that may, in light of other factors, lead to the survey’s exclusion from evidence.
Finally, as noted above, American Footwear merely stands for the proposition that a district court may exclude a survey where its failure to replicate actual marketplace conditions is severe. Likewise, Beverage Marketing simply reiterates the point that a failure to reflect actual marketplace conditions reduces a survey’s reliability. In sum, while Guess’s case law supports the notion that a survey may be excluded if it egregiously deviates from actual marketplace conditions, nothing therein provides an obvious answer to the inherently discretionary question of whether to exclude the Mantis Survey.122
Gucci argues that it did not remove the Guess name from the Mantis test bag, and that, accordingly, the case law discussed above is simply not relevant. Indeed, it claims that “the Guess bags were deliberately positioned so the embossed Guess name on the strap could be seen by respondents.”123 After examining the pictures at some length, I agree with Gucci that the Guess name was left on the product, albeit in a way that one is unlikely to find if not directed to look for it.
Guess’s argument, however, was not simply that the Mantis test bag failed to include the Guess name. Instead, Guess argued that the Mantis test bag was unrepresentative because it lacked permanent fixtures bearing the Guess name present on the vast majority of allegedly infringing bags that the Mantis test bag was to represent. Because Gucci provides no answer to this criticism, and because I find it to be convincing, I find that the Mantis test bag failed substantially to reflect actual marketplace conditions.124
*745c. Because of Its Flaws, the Mantis Survey Is Excluded
Having considered the primary challenges to the Mantis Survey, I must now decide if its flaws and weaknesses merit excluding it from evidence. As noted above, no survey is perfect, and most flaws go to weight, not admissibility. Nonetheless, I do not believe that the use of an unrepresentative test bag can be brushed aside in this way. Indeed, because the Mantis test bag was not representative of the line of allegedly infringing Guess bags, the Mantis Survey has little probative value on the issue of whether that line of handbags is likely to cause confusion in the post-sale marketplace. Accordingly, I find that this is one of the rare cases where a single — but extremely important — survey flaw supports the exclusion of the survey.
B. The Guess Surveys
1. Motion in Limine to Exclude the Helfgott Surveys
Gucci argues that the Helfgott Surveys suffer from a number of flaws that, taken together, require them to be excluded under Federal Rules of Evidence 702 and 403. For the reasons stated below, the Helfgott Surveys are excluded.
a. The Helfgott Surveys Are Point-of-Sale Surveys
Gucci argues that Dr. Helfgott’s survey controls were improper on several fronts, and claims that flaws in the survey population led to under-reporting of the likelihood of confusion.125 Guess vigorously disputes this criticism, claiming that the controls were proper and that the survey universe was not over-inclusive. Furthermore, Guess argues that Gucci wrongly considers the Helfgott Surveys individually instead of reading them together.126
Dr. Simonson argues at great length that because of the vast differences between consumer behavior in the point-of-sale marketplace and the post-sale marketplace, and because of the methodological differences involved in assessing whether a consumer is confused in either situation, a survey whose stated purpose is to assess point-of-sale confusion will have nothing relevant to say about post-sale confusion.127 This analysis is convincing, and I agree with it. Accordingly, I find that methodological flaws in a point-of-sale confusion survey need not be discussed to exclude that survey as irrelevant to the issue of post-sale confusion. Guess does not dispute that the Helfgott Surveys were conducted with the purpose of measuring point-of-sale confusion. As such, the Helfgott Surveys must be excluded under Rule 402 as irrelevant to the extent that they are offered to counter Gucci’s theory of post-sale confusion.
Guess attempts to rescue the Helfgott Surveys by arguing that they are indeed relevant to post-sale confusion. The argument runs as follows:
[The Scott Study shows that] in a substantial number of post-sale scenarios, the GUESS name will be visible to consumers viewing Guess’s Quattro G handbags. The high incidence of correct Guess identifications in Dr. Helfgott’s survey demonstrates that when *746consumers are able to observe the GUESS name, they are not confused. Consequently, in the significant numbers of post-sale settings in which the GUESS name is visible to consumers, no confusion is likely.128
This conclusion, however, cannot withstand close scrutiny. Common sense confirms what the Scott Study shows: most women carry their handbags with the ornamentation facing outward. Nonetheless, this does not mean that a casual observer in a post-sale environment will ever notice that ornamentation. The Helfgott Surveys at most show that consumers are not confused when they see the Guess name on a bag when they are “seriously considering buying it.” Because of the vast differences between the way that consumers examine products at the point of sale and the way they notice products in a post-sale environment, the fact that consumers are not confused when they see the Guess name on a bag at the point of sale does not indicate that they will not be confused when they see ornamentation (bearing the Guess name) in passing on a bag in a typical post-sale environment. Accordingly, I find that the Scott Study does not enable the Helfgott Surveys to provide relevant evidence regarding the issue of likelihood of confusion in the post-sale environment.
b. The Helfgott Surveys Are Not Admissible for Any Other Purpose
Apparently realizing that the Helfgott Surveys may be irrelevant to the issue of post-sale confusion, Guess argues that the Helfgott Surveys nonetheless provide evidence on two other issues: lack of consumer confusion and Guess’s good faith.129 For the reasons stated below, the Helfgott Surveys are not admissible on either point.
First, Guess argues that the Helfgott Surveys show that “when consumers come upon Guess products in the marketplace, they are able to readily and correctly identify those products as coming from, and only being associated with Guess,” and are therefore “directly relevant to the likelihood of confusion analysis required by the Second Circuit’s Polaroid decision.”130 This argument is essentially a repeat of the argument involving the Scott Survey. As noted above, however, the Helfgott Surveys do not show that consumers are not confused when they come across Guess products in typical post-sale environments, where they are unlikely to invest any substantial time or effort in examining the products they see. Because Gucci has clearly stated that it is only pursuing claims based on post-sale confusion, the fact that the Helfgott Surveys may show that consumers are not confused in the wholly different point-of-sale marketplace is simply not relevant.
Second, Guess argues that because the Helfgott Surveys show that consumers are not confused between Gucci and Guess handbags in point-of-sale situations, they also show that Guess lacked the intent to confuse consumers. Accordingly, Guess argues, the Helfgott Surveys provide evidence that the sixth Polaroid factor, which asks about Guess’s good faith or bad faith in adopting the trade dress in *747question, should weigh in its favor. I note, however, that confusion in the post-sale environment is completely different from point-of-sale confusion. In the post-sale environment, the concern is that the public in general will be deceived into thinking that the allegedly infringing product is authentic, and that some consumers interested in purchasing the authentic product will instead choose to purchase the allegedly infringing product, on the grounds that they can obtain the same prestige for less money. That is, the harm in post-sale confusion is that potential purchasers will knowingly choose the infringing product over the authentic one in order to obtain the status of the latter at the price of the former. Thus, even if the Helfgott Surveys could show that Guess did not intend to confuse consumers at the point of sale, they provide no evidence relevant to the question of whether or not Guess intended to confuse the public into believing that women carrying inexpensive Guess bags appeared to be carrying expensive Gucci bags. Accordingly, I reject this argument as well.
2. Motion in Limine to Exclude the Scott Survey
At the last hearing on this matter, I expressed my concern that the Scott Survey used a modified test product that never existed in the marketplace, and concluded that it should be excluded for that reason.131 As I noted in my discussion of the Mantis Survey, however, I now find that this flaw alone will typically not justify excluding a survey. Guess now raises several points on which the Scott Survey may provide relevant evidence, which I discuss in turn.
a. The Scott Survey Is Not Admissible on the Issue of Consumer Confusion
The Scott Survey was designed to measure point-of-sale confusion.132 As noted in the discussion of the Helfgott Surveys above, the differences between point-of-sale confusion and post-sale confusion are such that a survey designed to measure the former is irrelevant to the latter. Because Gucci’s only remaining theory of consumer confusion is a post-sale theory, the Scott Survey is inadmissible on that issue.
b. The Scott Survey Is Relevant to Guess’s Laches Defense
As with the Helfgott Surveys, Guess claims that the Scott Survey is relevant to issues beyond consumer confusion. The first of these claims is that the Scott Survey provides relevant evidence on the issue of laches. For the reasons explained below, the Scott Survey is admissible on this point.
Because the Lanham Act expressly incorporates the principles of equity, the equitable defense of laches is available in a trademark infringement action brought under the Lanham Act.133 As an equitable defense, however, it is also highly fact intensive and not typically amenable to summary judgment.134
In the context of a suit brought under the Lanham Act, the Second Circuit has stated that the likelihood that the defense of laches will apply increases as the senior user tolerates “the junior user’s competition in the same market with a name *748similar to that of the senior user” over an extended period of time.135 Generally speaking, the court considers three factors in a laches analysis: (1) whether the senior user knew that the junior user was using its mark; (2) whether the senior user inexcusably delayed taking action; and (3) whether the junior user is harmed as a result.136
While the Scott Survey does not provide evidence relevant to post-sale confusion, it does provide evidence that consumers, at the point of sale, associate the “Square G” buckle with Guess more than any other brand, including Gucci.137 This in turn implies that Guess would be harmed if it were forced to stop using the “Square G” buckle as the result of the instant suit. Accordingly, the Scott Survey is relevant to the third element of the laches defense, and is admissible on that point if its methodology is sufficiently sound to avoid exclusion. I now turn to that issue.
Gucci next argues that “the high confusion level in the Scott Survey control ... group reflected real consumer confusion” rather than mere survey noise; accordingly, Gucci claims that the control was so unreliable as to throw the entire Scott Survey into question.138 Gucci relies primarily on U.S. Polo Association v. PRL USA Holdings, Inc. to support this claim.139 In that case, Judge Robert Sweet relied on the article by Dr. Jacob Jacoby cited earlier for the proposition that a survey control with high confusion rates is problematic.140 Dr. Jacoby’s article, however, indicates that the primary focus should be on the net confusion level, and that a high confusion level in the survey control group goes to “the question of desirability,” that is, to weight rather than admissibility.141
Gucci attempts to deflect this potentially damning statement by arguing, without support from either case law or the scientific literature, that Dr. Jacoby “obviously ... was not referring to a situation where the control was deliberately selected to offset the confusion level in the test cell.”142 Even if this argument is accepted, it would only show that the Scott Survey was flawed if Dr. Scott indeed selected her survey control in such a manner. Gucci argues that Dr. Scott did just that by intentionally selecting a G shape that she knew was similar to the shapes used in Gucci’s double G designs.143 Guess responds by arguing that Gucci cannot claim ownership over all rounded G shapes, and that, in any event, the G shape that Dr. Scott used as her control was not confusingly similar to those used by Gucci.144
Having examined the double G designs that are the basis of Gucci’s argument, I find that the G referenced by Gucci is not confusingly similar to Dr. Scott’s control G shape. The former is approximately as tall as it is wide, whereas the latter is *749substantially wider than it is tall. Any similarity between them is attributable to the simple fact that they are both G’s.
[[Image here]]
I therefore find that Dr. Scott did not use, intentionally or otherwise, a control buckle that was confusingly similar to the Gucci G shapes mentioned above. Accordingly, I reject Gucci’s argument that the Scott Survey used an inadequate control, and hold that it is admissible on the third factor of the laches analysis for the reasons already given.
c. The Scott Survey Is Not Relevant to Show that Gucci’s Square G Mark Is Weak
Because the strength of a mark is relevant under Polaroid to determining whether there is a likelihood of confusion, Guess argues that the Scott Survey is relevant in that it shows that Gucci’s “Stylized G” mark is weak because the level of association with Gucci it measured, 18.6 percent, is insufficient to establish secondary meaning.145 The problem with this argument, as Gucci correctly points out,146 is that confusion surveys and secondary meaning surveys are designed differently in two key respects. First, confusion surveys and secondary meaning surveys do not measure the same universe of respondents.147 Second, confusion surveys ask respondents about the source of the junior user’s product, whereas secondary meaning surveys ask respondents about the source of the senior user’s product.148 Accordingly, drawing a secondary meaning inference, even from a perfect confusion survey, is improper. As such, the Scott Survey, which is clearly a point-of-sale confusion survey,149 cannot be used to support Guess’s “weak mark” hypothesis in this instance.
V. CONCLUSION
For the reasons given above, Gucci’s motion is granted in part and denied in part as follows: the Helfgott Surveys are excluded on all of the issues for which Guess offers them. The Scott Survey is admissible on the issue of laches, but, like the Helfgott Surveys, excluded on the *750remainder of the issues for which Guess offers it.
For the reasons given above, Guess’s motion is granted in part and denied in part as follows: the Mantis Survey is inadmissible on the issue of post-sale confusion. The Mazis Survey is admissible on the issue of association as it relates to dilution. The Clerk of the Court is directed to close these motions (Docket Nos. 162 and 163). A hearing is scheduled for December 2, 2011 at 4:30 p.m.
SO ORDERED.
. See Second Amended Complaint at 18-22.
. See id. at 22-24.
. See Gucci’s Memorandum of Law in Support of Its Motion in Limine to Exclude Guess's Proposed Expert Opinions, Testimony, and Surveys of Dr. Myron J. Helfgott and Dr. Carol A. Scott ("Gucci Mem.”).
. See Guess's Memorandum of Law in Support of Its Motion in Limine to Exclude the Surveys of Dr. Michael Rappeport, George Mantis, and Dr. Michael B. Mazis ("Guess Mem.”).
. See 8/4/11 Hearing Transcript ("Tr.") at 22:14-25.
. See 8/9/11 Letter from Gucci to the Court at 2.
. See Guess Mem. at 2-3. Guess also seeks to exclude Dr. Rappeport's report, which was offered to rebut Dr. Helfgott’s survey. See id.
. See 6 McCarthy on Trademarks § 32:174 at 32-367 to 32-369 (describing the Eveready format and its namesake Union Carbide Corp. v. Ever-Ready, Inc., 531 F.2d 366 (7th Cir.1976)).
. 5/25/11 Likelihood of Confusion Study Prepared by the Mantis Group, Inc. ("Mantis Survey”), Ex. 1 to Declaration of Robert C. Welsh, Attorney for Guess ("Welsh Decl.”), at 1.
. See id. at 6.
. See id. at 3-4.
. See id. at 5.
. See id. at 5-6. Mr. Mantis also used screening questions to ensure that he interviewed people over the age of eighteen who were likely to buy a cross-body bag priced at seventy-five dollars or more in the next twelve months. Furthermore, he eliminated anyone who was connected to the retail, advertising, or survey industries, who participated in a marketing survey in the last twelve months, or who was lacking her normal corrective eyewear. Finally, he used a quota sampling method to ensure that respondents were chosen for participation in proportion to their qualification rates, accounting for gender and age. See id. at 2-3.
. Id. at 8-11.
. Id. at 9.
. See id. at 11.
. Id. at 13.
. 6/27/11 Affidavit of Dr. Shari S. Diamond ("Diamond Aff.”), Ex. 5 to Welsh Decl.
. 5/25/11 Expert Report of Michael B. Mazis, Ph.D. ("Mazis Survey”), Ex. 2 to Welsh Decl., at 3.
. See id. at 5.
. Id. at 8-9. Dr. Mazis also used screening questions to ensure that survey respondents were at least eighteen years old and likely to spend one hundred dollars or more on a handbag in the next six months. He also excluded people who worked in the malls where respondents were interviewed or who had participated in a market research survey in the previous three months. Finally, like Mr. Mantis, he excluded respondents who were connected to the retail, advertising, or survey industries, or who did not have their normal eyewear with them. See id. at 8. Dr. Mazis did not use a quota sampling method.
. Id. at 9-10.
. Id. at 10-11.
. See Gucci Mem. at 1. See also Gucci’s Reply Memorandum of Law in Further Support of Its Motion to Exclude Guess’s Proposed Expert Opinions, Testimony, and Surveys of Dr. Myron J. Helfgott and Dr. Carol A. Scott (“Gucci Rep. Mem.”) at 5 n. 9 (explaining that Gucci seeks exclusion of all of the Guess surveys).
. Expert Report of Myron J. Helfgott, Ph.D.: The Results of Two Likelihood of Confusion Surveys Concerning Gucci’s Claimed Trade Dress and Guess?'s Quattro G Design ("Helfgott Report”), Ex. A to Gucci’s Notice of Motion, at 6.
. Id.
. See id. at 11.
. Id. at 7. Dr. Helfgott used screening questions to ensure that all respondents were females at least eighteen years old who had purchased within the past twelve months, or planned to purchase within the next twelve months, a handbag costing at least forty dollars. See id. at 10. He also created two age groups — eighteen to thirty-nine and forty and over — and selected equal numbers of respondents from each. See id. at 9.
. Id. at 11 (quotation marks omitted).
. Id.
. See id. at 12.
. See id. at 13.
. Id.
. Id. at 6.
. See id. at 7-8.
. Id. at 14.
. See 6/27/11 Rebuttal Expert Report of Dr. Itamar Simonson, Ex. C to Gucci's Notice of Motion (“Simonson Report”).
. Gucci also offers the report of Dr. Rappeport to challenge the Helfgott Surveys. However, because the Helfgott Surveys are inadmissible on the very point that Dr. Rappeport challenges, his report need not be considered.
. 5/15/11 Expert Report of Professor Carol A. Scott ("Scott Report”), Ex. B. to Gucci's Notice of Motion at 3-4.
. See id. at 7.
. See id. at 5-6. Dr. Scott used screening questions to ensure that all respondents were females at least eighteen years old who had purchased within the past twelve months, or planned to purchase within the next twelve months, a belt costing at least thirty-five dollars. She also used a quota sampling method to ensure that the age distribution of the respondents matched that of the general female population of the United States. Finally, she used standard exclusion questions similar to those used by Dr. Mazis. See id. at 4-5.
. See id. at 6-7.
. See id. at 8.
. See id. at 11. Notably, none of the test group respondents believed that the company that put out the test belt did so in association with Gucci. Only one control group respondent believed that the company that put out the control belt did so in association with Gucci. See id. at 9-10.
. See id. at 10.
. See Ex. 10 to the Scott Report at 13-14.
. Scott Report at 12.
. See id.
. 15 U.S.C. § 1114(1).
. Id. § 1127.
. Id. § 1057(b).
. Gruner + Jahr USA Publ’g v. Meredith Corp., 991 F.2d 1072, 1076 (2d Cir.1993).
. Virgin Enters. Ltd. v. Nawab, 335 F.3d 141, 146 (2d Cir.2003) (citing Gruner + Jahr USA Publ’g, 991 F.2d at 1074). Accord Starbucks Corp. v. Borough Coffee, Inc., 588 F.3d 97, 114 (2d Cir.2009); Louis Vuitton Malletier v. Dooney & Bourke, Inc. ("Vuitton II"), 454 F.3d 108, 115 (2d Cir.2006).
. Virgin Enters., 335 F.3d at 146. Accord Starbucks Corp., 588 F.3d at 114 (“To prevail on a trademark infringement and unfair competition claim under [section 32(1) or section 43(a) of the Act], in addition to demonstrating that the plaintiff’s mark is protected, the plaintiff must prove that the defendant’s use of the allegedly infringing mark would likely cause confusion as to the origin or sponsorship of the defendant's goods with plaintiff’s goods.”); Vuitton II, 454 F.3d at 115.
. Playtex Prods., Inc. v. Georgia-Pacific Corp., 390 F.3d 158, 161 (2d Cir.2004) (quoting Cadbury Beverages, Inc. v. Cott Corp., 73 F.3d 474, 477-78 (2d Cir.1996)). Accord Chambers v. Time Warner, Inc., 282 F.3d 147, 155 (2d Cir.2002) ("Where there is a claim of consumer confusion [as] to the association of a product or service with another person's trademark, the central inquiry is whether it is likely that 'an appreciable number of ordinarily prudent purchasers' will be misled as to the source or sponsorship of the product or service in question.” (quoting EMI Catalogue P’ship v. Hill, Holliday, Connors, Cosmopulos, Inc., 228 F.3d 56, 61-62 (2d Cir.2000))).
. Playtex Prods., 390 F.3d at 161 (quoting Nora Beverages, Inc. v. Perrier Group of Am., Inc., 269 F.3d 114, 121 (2d Cir.2001)).
. Vuitton II, 454 F.3d at 115 (citing Playtex Prods., 390 F.3d at 161).
. Wal-Mart Stores, Inc. v. Samara Bros. ("Samara I"), 529 U.S. 205, 209, 120 S.Ct. 1339, 146 L.Ed.2d 182 (2000).
. Nora Beverages, 269 F.3d at 118.
. Yurman Design, Inc. v. PAJ, Inc., 262 F.3d 101, 117 (2d Cir.2001).
. Landscape Forms, Inc. v. Columbia Cascade Co., 113 F.3d 373, 381 (2d Cir.1997).
. See Samara I, 529 U.S. at 211, 120 S.Ct. 1339. It is important to note that while individual elements of a trade dress may be unprotectable when viewed in isolation, "it is the combination of elements that should be the focus of the distinctiveness inquiry. Thus, the overall dress [may be] distinctive despite its incorporation of [unprotectable] elements." Jeffrey Milstein, Inc. v. Greger, Lawlor, Roth, Inc., 58 F.3d 27, 32 (2d Cir.1995).
. 15 U.S.C. § 1125(a)(3).
. See id. § 1125(a)(1)(A).
. See 287 F.2d 492, 495 (2d Cir.1961); see also Starbucks Corp., 588 F.3d at 115.
. See Polaroid, 287 F.2d at 495.
. Starbucks Corp., 588 F.3d at 115 (quoting Star Indus., Inc. v. Bacardi & Co. Ltd., 412 F.3d 373, 384 (2d Cir.2005)).
. Brennan’s, Inc. v. Brennan’s Restaurant, LLC, 360 F.3d 125, 130 (2d Cir.2004) (citing Polaroid, 287 F.2d at 495).
. Id. (quoting Lois Sportswear, U.S.A., Inc. v. Levi Strauss & Co. ("Lois II”), 799 F.2d 867, 872 (2d Cir.1986)).
. Bourjaily v. United States, 483 U.S. 171, 175-76, 107 S.Ct. 2775, 97 L.Ed.2d 144 (1987) (discussing Rule 104(a) of the Federal Rules of Evidence). Accord Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 592 & n. 10, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993) (citing Bourjaily, 483 U.S. at 175-76, 107 S.Ct. 2775, and explaining that the proponent of expert testimony must prove admissibility by a preponderance of proof).
. 509 U.S. at 597, 113 S.Ct. 2786. Accord Kumho Tire Co. v. Carmichael, 526 U.S. 137, 147-49, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999).
. Bickerstaff v. Vassar Coll., 196 F.3d 435, 449 (2d Cir.1999) (quoting Hollander v. American Cyanamid Co., 172 F.3d 192, 202 (2d Cir.1999)). Accord Louis Vuitton Malletier v. Dooney & Bourke, Inc. (“Vuitton IV”), 525 F.Supp.2d 558, 561-65 (S.D.N.Y.2007) (discussing district court’s "special obligation" to gatekeep with respect to expert evidence).
Additionally, expert testimony may not usurp the role of the court in determining the applicable law. See United States v. Lumpkin, 192 F.3d 280, 289 (2d Cir.1999). Although an expert "may opine on an issue of fact,” an expert "may not give testimony stating ultimate legal conclusions based on those facts.” United States v. Bilzerian, 926 F.2d 1285, 1294 (2d Cir.1991). Expert testimony is also inadmissible when it addresses “lay matters which [the trier of fact] is capable of understanding and deciding without the expert’s help.” Andrews v. Metro N. Commuter R.R. Co., 882 F.2d 705, 708 (2d Cir.1989).
. See Daubert, 509 U.S. at 595, 113 S.Ct. 2786.
. Vuitton IV, 525 F.Supp.2d at 562 (citation and quotation marks omitted).
. Id. (quotation marks omitted).
. Virgin Enters., 335 F.3d at 151.
. Lois II, 799 F.2d at 875.
. See Schering Corp. v. Pfizer Inc., 189 F.3d 218, 225-28 (2d Cir.1999).
. 6 McCarthy on Trademarks § 32:184 at 32-392. As McCarthy explains, "[d]irect evidence of actual confusion can come only from such sources as misdirected phone calls or letters or even from that rarest of evidence, the testimony of someone willing to testify that they were once a confused customer.” Id. Although survey evidence is not direct evidence of actual confusion, it is nonetheless routinely categorized “under the heading of 'actual confusion.’ ” Id. at 32-393.
. Id. § 32:178 at 32-380.
. Id. at 32-380 to 32-381.
. Louis Vuitton Malletier v. Dooney & Bourke, Inc. ("Vuitton I”), 340 F.Supp.2d 415, 433 (S.D.N.Y.2004) (citation and alterations omitted), vacated on other grounds by Vuitton II, 454 F.3d at 117. See also Manual for Complex Litigation § 11.493 at 103 (Federal Judicial Center 4th ed. 2004) (setting out seven criteria); Shari Seidman Diamond, Reference Guide on Survey Research, in Reference Manual on Scientific Evidence (“Diamond on Survey Research”) at 359, 373-418 (Federal Judicial Center 3d ed. 2011) (discussing criteria to be considered to determine the admissibility of and weight to be accorded to survey evidence).
. 6 McCarthy on Trademarks § 32:163 at 32-333.
. See Trouble v. Wet Seal, 179 F.Supp.2d 291, 308 (S.D.N.Y.2001) ("Although no survey can construct a perfect replica of 'real world' buying patterns, a survey must use a stimulus that, at a minimum, tests for confusion by roughly simulating marketplace conditions.”); see also American Footwear Corp. v. General Footwear Co. Ltd., 609 F.2d 655, 660 n. 4 (2d Cir.1979) (holding district court decision to exclude survey "for failure to conduct it under actual marketing conditions” not clearly erroneous). But cf. Vista Food Exch., Inc. v. Vistor Corp., No. 03-CV-5203, 2005 WL 2371958 at *5-7 (E.D.N.Y. Sept. 27, 2005) (noting that failure to approximate actual marketplace conditions is only one factor amongst many to consider when determining what weight, if any, to give to a survey).
. See Schering, 189 F.3d at 228; Starter Corp. v. Converse, Inc., 170 F.3d 286, 296-98 (2d Cir.1999) (affirming district court's exclusion of survey where any probative value was outweighed by prejudicial effect); Vuitton IV, 525 F.Supp.2d at 568 (adopting Special Masters’ recommendation to exclude flawed survey under Rules 702 and 403); 6 McCarthy on Trademarks § 32:170 at 32-351 to 32-352 ("In an extreme case, an improperly conducted survey with slanted questions or serious methodological defects may be excludable as 'irrelevant' of the true state of mind of potential purchasers.... [However, t]he majority rule is that while technical deficiencies can reduce a survey’s weight, they will not prevent the survey from being admitted into evidence.”).
. See Tr. at 14:5-10.
. Guess also moved to exclude the Rappeport Survey, which Gucci produced in rebuttal to the Helfgott Surveys. Because the Helfgott Surveys are excluded for the reasons given below, the admissibility of the Rappeport Survey is moot.
. Dilution exists if an "association arising from the similarity between a mark ... and a famous mark” either "impairs the distinctiveness of the famous mark” or "harms the reputation of the famous mark.” 15 U.S.C. § 1125(c)(2). As a threshold matter, Guess argues that the Mazis Survey should be excluded as irrelevant to any dilution issues because it only measures association. See Guess Mem. at 21-24. I disagree. A survey that determines whether and to what degree association exists is plainly relevant to the issue of whether there is association that impairs or harms a famous mark. Gucci may well fail to show impairment or harm, but that does not affect the admissibility of the Mazis Survey on the issue of association.
. See Guess Mem. at 20, 24.
. See id.
. See Jacob Jacoby, Experimental Design and the Selection of Controls in Trademark and Deceptive Advertising Surveys, 92 Trademark Reporter 890, 935-37 (2002).
. Michael Rappeport, Litigation Surveys— Social “Science” as Evidence, 92 Trademark Rep. 957, 987 (2002).
. 32 F.Supp.2d 561 (E.D.N.Y.1999).
. See id. at 574-75.
. See Fiji Water Co., LLC v. Fiji Mineral Water USA, LLC, 741 F.Supp.2d 1165 (C.D.Cal.2010).
. Guess also argues that by rotating the Quattro G pattern 45 degrees, the Gucci Surveys improperly increased the confusion level. See Guess’s Reply Memorandum of Law in Further Support of Its Motion in Limine to Exclude the Surveys of Dr. Michael Rappeport, George Mantis, and Dr. Michael B. Mazis (“Guess Rep. Mem.”) at 8. According to Dr. Diamond, this change "apparently had the effect of causing the Quattro G design ... to appear to some respondents as showing *741interlocking C's, rather than G's.” Diamond 1 at 8. Because the other differences between the control bag and the test bag are potential confounding factors, however, it is impossible to draw a firm causal link between the rotation of the Quattro G symbol and the increase in "C" brand responses without an additional survey. Furthermore, Gucci correctly argues that rotating the Quattro G symbol was necessary "to remove any semblance ... of the diamond pattern.” Gucci's Memorandum of Law in Opposition to Guess’s Motion in Limine to Exclude the Surveys of Dr. Michael Rappeport, George Mantis, and Dr. Michael B. Mazis ("Gucci Opp. Mem.”) at 12. Accordingly, I find no reason to hold that the rotation of the Quattro G symbol was improper.
. See Guess Mem. at 17-18, 24-25. The only case law that Guess cites to support this argument is my decision in Vuitton IV, 525 F.Supp.2d at 605. However, because that case involved a trademark, and not a multicomponent trade dress, I find that it is inapplicable. See id. at 592.
. See Guess Mem. at 24-25.
. See Jerre B. Swann, Likelihood of Confusion Surveys and the Straitened Scope of Squirt, 98 Trademark Rep. 739, 741 n. 13 (2008).
. See Diamond Aff. at 4.
. Guess also argues that the Mantis Survey failed to define an appropriate population or use an adequate sampling method. See Diamond Aff. at 9. Because I find the issue of failure to reflect actual marketplace conditions to be dispositive, I do not reach that argument.
. Gucci Opp. Mem. at 8.
. See id.
. Diamond Aff. at 5.
. 631 F.Supp. 735 (S.D.N.Y.1985) (“Lois I”).
. See Gucci Opp. Mem. at 9 n. 12. Guess does not dispute Gucci’s characterization of this case. The Second Circuit opinion reviewing that case noted, however, that the district court "discounted [the survey] due to methodological defects in simulating the post-sale environment.” Lois II, 799 F.2d at 875. While the district court technically decided to give the survey at issue little or no weight, the distinction between that action and excluding the survey, especially when parties separate their motions to exclude and their motions for summary judgment, is largely academic.
. Civil No. 01-1582-RE, 2003 WL 25710435 (D.Or. Jan. 29, 2003).
. 529 F.Supp.2d 1215 (D.Or.2007).
. See Guess Rep. Mem. at 5.
. See adidas-Salomon, 2003 WL 25710435, at *8. See also adidas America, 529 F.Supp.2d at 1227.
. Guess Mem. at 16.
. Diamond Aff. at 6.
. Guess Mem. at 17.
. No. 04-cv-7203, 2006 WL 1012939 (S.D.N.Y. Apr. 19, 2006).
. 690 F.Supp.2d 218 (S.D.N.Y.2010).
. See 609 F.2d at 660 n. 4.
. See 36 Fed.Appx. 12, 14-15 (2d Cir.2002).
. See THOIP, 690 F.Supp.2d at 240.
. See id.
. See id. at 241.
. See Juicy Couture, 2006 WL 1012939, at *25-26 (noting that the survey in question was “so fundamentally flawed [ ] that it would have been excluded from evidence at a jury trial”).
. Gucci argues that Guess’s cited cases stand for the proposition that deviations from actual marketplace conditions are problematic if and only if they are designed to increase the level of confusion. According to Gucci, removing the center stripe pattern on the Mantis test bag is not problematic because it had the effect of reducing the level of confusion. See Gucci Opp. Mem. at 10. There are two problems with this argument. First, Gucci provides no evidence that altering the center stripe pattern decreased the level of confusion. Second, even if Gucci had provided such evidence, it provides no case law to support the notion that departures from actual marketplace conditions are ever acceptable. Accordingly, I reject Gucci's argument here.
. Gucci Opp. Mem. at 12-13.
. Gucci claims that it was forced to use an atypical test bag because Guess failed to provide it with a more typical one. See Gucci Opp. Mem. at 13 n. 18. Gucci had at least one bag — the “Basique Bowler” used in the Mazis Survey — that would have withstood Guess’s attack on the atypicality of the Mantis test bag. Accordingly, Gucci’s claim is rejected.
. See Gucci Mem. at 13-16. Gucci originally asked the court to exclude only the Osaka Survey. As noted above, however, it now asks that I exclude all three of the Helfgott Surveys.
. See Guess’s Memorandum of Law in Opposition to Gucci America, Inc.'s Motion to Exclude the Expert Opinions, Testimony, and Surveys of Dr. Myron J. Helfgott and Dr. Carol A. Scott ("Guess Opp. Mem.”) at 10-14.
. See Simonson Report at 7-13.
. Guess Opp. Mem. at 5. The evidence about how women carry their handbags in post-sale scenarios comes from a sur-rebuttal study conducted by Dr. Scott in response to the Rappeport Survey. See Declaration of Carol A. Scott, PhD, Ex. 8 to Welsh Decl. ("Scott Study”).
. See Guess Opp. Mem. at 2-4.
. Id. at 3 (emphasis in original).
. See Tr. at 20:2-14.
. See Expert Report of Professor Carol A. Scott ("Scott Report”), Ex. B to Gucci’s Notice of Motion, at 6.
. See 6 McCarthy on Trademarks § 31:1.
. See, e.g., U.S. Bank N.A. v. Ables & Hall Builders, 582 F.Supp.2d 605, 611 (S.D.N.Y.2008).
. Patsy's Brand, Inc. v. I.O.B. Realty, Inc., 317 F.3d 209, 217 (2d Cir.2003).
. See Black Diamond Sportswear, Inc. v. Black Diamond Equip., Ltd., - Fed.Appx. -, -, No. 06-3508-cv, 2007 WL 2914452, at *1 (2d Cir. Oct. 5, 2007).
. See Scott Report at 3.
. Simonson Report at 17.
. 800 F.Supp.2d 515 (S.D.N.Y.2011).
. See id. at 534-35.
. Jacoby, supra note 91 at 932 n. 76. Accord Diamond on Survey Research at 399.
. Gucci Rep. Mem. at 10 n. 15.
. See Gucci Mem. at 18.
. See Guess Opp. Mem. at 22.
. See id. at 16.
. See Gucci Rep. Mem. at 8.
. See Rappeport, supra note 92 at 966 n. 23.
. See Vincent N. Palladino, Assessing Trademark Significance: Genericness, Secondary Meaning and Surveys, 92 Trademark Reporter 857, 870 (2002).
. See Scott Report at 6.