RULING ON MOTION FOR SUMMARY JUDGMENT
DORSEY, District Judge. Plaintiff alleges disparate impact under 42 U.S.C. § 2000e-5. Defendants move for summary judgment.
I. Background
The facts are fully stated in Gomes v. Avco Corp., 964 F.2d 1330 (2d Cir.1992), familiarity with which is assumed.
Plaintiff claims that Avco “employed a neutral practice ... that disproportionately excluded Portuguese from skilled machinist positions at Avco.” Id. at 1334. That practice is an “eight-year rule,” which requires that an employee have eight years’ practical experience to become eligible for promotion to a skilled trade position without serving an apprenticeship.
II. Discussion
A. Summary Judgment Standard
Summary judgment will be granted only if there is no genuine issue as to any material fact and the moving party is entitled to judgment as a matter of law. Fed.R.Civ.P. 56(c). The initial burden is on the moving party to demonstrate that there are no material issues of fact in dispute. Thompson v. Gjivoje, 896 F.2d 716, 720 (2d Cir.1990). “All reasonable inferences and any ambiguities are drawn in favor of the non-moving party.” Id.
B. Prima Facie Case
Defendants argue that plaintiff demonstrates no genuine issue of material fact establishing a prima facie case of disparate impact. Defendants do not here argue that if a prima facie case is established, the eight-year rule nonetheless has a “manifest relationship to the employment in question.” See Griggs v. Duke Power Co., 401 U.S. 424, 432, 91 S.Ct. 849, 854, 28 L.Ed.2d 158 (1971).
To establish a prima facie case of disparate impact, plaintiff must show (1) that there is a significant disparity between the racial composition of the qualified persons in the labor market and the persons holding at-issue jobs, and (2) that the challenged employment practice caused the statistical disparity. Wards Cove Packing Co. v. Atonio, 490 U.S. 642, 650-56, 109 S.Ct. 2115, 2121-24, 104 L.Ed.2d 733 (1989); E.E.O.C. v. Joint Apprenticeship Comm., 895 F.2d 86, 90 (2d Cir.1990). The focus is on the impact of a particular hiring practice. Wards Cove, 490 U.S. at 656, 109 S.Ct. at 2124.
Defendants argue (1) that plaintiff’s statistical showing is insufficient to establish the existence of a disparity between qualified persons in the labor market and the persons holding at-issue jobs at Avco, and (2) that plaintiff has failed to raise a genuine issue of material fact as to whether the challenged employment practice caused the alleged disparity.
1. Statistical Showing of a Disparity
“[S]tatistical proof can alone make out a prima facie case.” Id. at 650, 109 S.Ct. at 2121. Generally, comparison between the racial composition of the qualified persons in the labor market and the persons holding at-issue jobs is the proper basis for the initial inquiry. Id. at 650-51, 109 S.Ct. at 2121. “Alternatively, in cases where such labor market statistics will be difficult if not impossible to ascertain, ... certain other statistics — such as measures indicating the racial composition of ‘otherwise-qualified applicants’ for at-issue jobs — are equally probative for this purpose.” Id. at 651, 109 S.Ct. at 2121. Figures for the general population, if accurately reflective of the pool of qualified job applicants, may sustain a prima facie case. Id. at 651 n. 6, 109 S.Ct. at 2121 n. 6.
The science of statistical analysis encompasses more than the mere notation of directly observed phenomena. Necessity often dictates that the composition of a given population be estimated by projecting data gathered by less than optimal means from only a sample of that population. If the sample is adequate, the data gathering techniques reliable, and the conclusions drawn demonstrated to be statistically significant, such estimates and projections may properly be admitted into evidence.
[S]tatistical evidence by its very nature deals with probabilities rather than certainties. All that can be required of methods employed in gathering such evidence is that they assure reasonably accurate findings. Absolute perfection usually is not attainable in this kind of endeavor.
Vulcan Society v. Civil Service Commission, 360 F.Supp. 1265, 1270 (S.D.N.Y.), aff'd, 490 F.2d 387 (2d Cir.1973) (footnote omitted). Not being generally obtainable, “absolute certainty” is not required.
Guardians Ass’n v. Civil Serv. Comm’n, 633 F.2d 232, 240-41 (2d Cir.1980), aff'd, 463 U.S. 582, 103 S.Ct. 3221, 77 L.Ed.2d 866 (1983), cert. denied, 463 U.S. 1228, 103 S.Ct. 3568, 77 L.Ed.2d 1410 (1983).1
Plaintiff argues that the proportion of Portuguese in the qualified labor market is significantly greater than among the skilled machinists at Avco. Plaintiff concludes that four percent of the qualified labor market is Portuguese, while only .6 percent of the skilled machinists at Avco are Portuguese. Defendants do not dispute that these numbers would constitute a statistically significant disparity. Plaintiff’s statistics are based on a comparison of the Avco skilled machinist work force with the machinist work force of four large manufacturing employers in the area, which plaintiff argues is a sufficiently accurate proxy for the qualified labor market.
Defendants mount several challenges to plaintiff’s statistics.
a. Identification of the Qualified Labor Pool
To determine the qualified labor pool, plaintiff subpoenaed the names of employees in machining classifications at seven companies in the Bridgeport-Stratford area, where Avco is located. Six of these companies complied. From four of those companies, the “qualified labor pool” for statistical comparison was drawn.2
Defendants have two objections to this method. First, Avco argues that plaintiff’s sample is too geographically limited to represent accurately the qualified labor pool in the area. Avco argues that its work force comes from many towns outside the immediate metropolitan area, and thus the qualified labor pool should extend to include that within commuting distance. It cites census figures and argues that the qualified labor pool actually includes as many as 93,501, the total of those in Fairfield and New Haven counties working in “precision production, craft and repair.” Though there is some overstatement, Avco’s argument has some merit; the more plaintiff’s sample diverges from the actual number of people in the qualified labor pool, the less reliable plaintiff’s analysis will be.
Plaintiff, however, faced particularly unusual difficulties in determining the appropriate “qualified labor pool.” According to plaintiff, the EEO-1 reporting forms by which employers report the ethnic and racial background of their work forces do not identify an employee as Portuguese. Neither the U.S. Department of Commerce, the state Department of Labor nor the Equal Employment Opportunity Commission maintains statistics on the number of Portuguese by geographic area. Thus, determining an appropriate comparison group is a matter of some difficulty for plaintiff. “[N]ecessity often dictates that the composition of a given population be estimated by projecting data gathered by less than optimal means from only a sample of that population.” Guardians, 633 F.2d at 240. There is “no justification for holding [plaintiff] to an unrealistic standard regarding the ‘completeness’ of [his] statistical showing.” Id. at n. 13. However, the sample must nonetheless be adequate, the data gathering technique must be reliable, and the methods must assure reasonably accurate findings. Id. at 240-41.
Plaintiff contends that his sample is a reasonable proxy for the actual qualified labor pool, which is impossible to determine with certainty. Such “sampling” is not necessarily unreasonable where it might reasonably be found to represent accurately the larger population. Statistics on racial composition of the general population are frequently based on scientific sampling techniques; these statistics may nonetheless be used to show disparate impact in certain cases. See Wards Cove, 490 U.S. at 651 n. 6, 109 S.Ct. at 2121 n. 6. It cannot now be held that plaintiff’s sample fails, as a matter of law, to show a disparity based on Portuguese national origin.
Defendants’ second objection is more troubling: that plaintiff’s use of data from only four of the six companies that responded skews the result. Plaintiff responds that the statistics from the other two companies “were incapable of comparison to Avco’s eight year rule based upon the explanation of ‘skilled’ which was received from these companies.” Plaintiff’s Opposition at 20, n. 5. Defendants contend further that these statistics were eliminated precisely because the two companies had practices that were equivalent to Avco’s eight-year rule, thus undermining plaintiff’s causation argument, addressed below. Plaintiff does not otherwise justify the exclusion of the two companies as improving the representativeness of the sample.
The involvement of plaintiff’s counsel, who neither claims to be nor was disclosed as an expert, in failing to include available data from the proposed sample seriously undermines the probative value of the result. The weight of statistical evidence depends upon “the existence of proper supportive facts and the absence of variables which would undermine the reasonableness of the inference of discrimination.” United States v. Ironworkers Local 86, 443 F.2d 544, 551 (9th Cir.), cert. denied, 404 U.S. 984, 92 S.Ct. 447, 30 L.Ed.2d 367 (1971). In Geller v. Markham, 635 F.2d 1027, 1033 (2d Cir.1980), cert. denied, 451 U.S. 945, 101 S.Ct. 2028, 68 L.Ed.2d 332 (1981), uncorroborated statistics prepared by a person with no expertise as a statistician and who had a direct interest in the outcome of the case were “so defective as to justify the district court’s refusal to give them any weight.” Id. at 1033. Here, although plaintiff’s statistics do not appear to be as deficient as those in Geller, the role of plaintiff’s counsel in paring the initial data, which itself questionably represents the qualified labor pool, renders the data presented even more subject to question.3
That plaintiff’s evidence can be questioned does not justify summary judgment; defendants must show that the statistical evidence is so lacking in probative value that no reasonable jury could hold for plaintiff. See Anderson v. Liberty Lobby, Inc., 477 U.S. 242, 248, 106 S.Ct. 2505, 2510, 91 L.Ed.2d 202 (1986).
b. Identification of the Relevant Class of Workers
Defendants argue that by eliminating from the qualified labor pool all those skilled workers who were not machinists and by eliminating certain categories of machinist as well, plaintiff greatly reduced the number included in the group. Because the eight-year rule applied to all skilled positions, defendants argue, the comparison group should have included all skilled workers in the relevant labor market, not just machinists. At the very least, defendants argue, plaintiff should have included all categories of machinist.
Plaintiff responds that including all skilled workers would have rendered his analysis subject to attack for overbreadth, in that the “jobs at issue” are skilled machinist positions. See, e.g., Mazus v. Department of Transp., 629 F.2d 870, 875 (3d Cir.1980), cert. denied, 449 U.S. 1126, 101 S.Ct. 945, 67 L.Ed.2d 113 (1981). The first issue is the level of specificity at which the “job at issue” should be defined. One extreme would be to apply it to all jobs subject to the eight-year rule. The other would be to apply it only to the exact positions sought by plaintiff, i.e., a jig bore operator or boring mill operator for a specific type of jig bore or boring mill.
It cannot be held, as a matter of law, that the “job at issue” must include all positions subject to the challenged employment practice. Inclusion of all such jobs may infect the analysis with factors that pertain to certain positions. The applicant pool for skilled maintenance personnel, for example, may differ from that for skilled machinists in ways unrelated to the eight-year rule. Inclusion of widely divergent job categories in the analysis does not necessarily lead to a more accurate result. Of course, using only a smaller group detracts from the analysis as well because statistical conclusions from smaller samples are less reliable. Determination of the “job at issue” is best made on a case by case basis. It cannot be said, as a matter of law, at this stage of the proceedings that plaintiff’s use of skilled machinists as the reference group is invalid.
Defendants object further to the role of plaintiff’s counsel in defining the base group. It would have been more prudent to leave to an expert the job of editing the data. Plaintiff’s counsel must inevitably have been involved in determining the analysis required by the law. Defendants’ arguments that the definition of the “job at issue” is a matter of law and that plaintiff’s counsel should not have participated in its determination contradict one another.
To defendants’ claim that plaintiff improperly excluded categories of skilled machinist from his analysis, plaintiff responds that those omissions were inadvertent and submits a reanalysis, after inclusion of those categories, which still produced significant results. Although this does not reflect appropriate care in plaintiff’s initial analysis, the reanalysis rectified the error claimed by defendants.
c. Aggregation of Data
Defendants argue that plaintiff’s analysis is further undermined by the aggregation of the data from each of the four companies into one comparison group. Statistical evidence should not be based on “such dubious analytical techniques as obtaining an overall percentage figure by averaging annual percentages for several different years, thus contravening the basic, well-recognized principle that such averaging by percentages produces meaningless and misleading results.” Geller, 635 F.2d at 1033. Plaintiff did not average percentages, but he aggregated groups of workers into one large group to serve as the basis of the statistical analysis. Defendants have not shown how this technique rendered the result invalid.
d. Determination of Portuguese National Origin
Defendants argue that plaintiff’s method of determining workers’ national origin further undermines the analysis. A Portuguese linguist was used to determine, from surnames, Portuguese national origin. The linguist counted the Portuguese surnames on the worker lists.
Whether the linguist is qualified as an expert in this regard is a disputed question of fact. Even a qualified linguist’s identification of people of Portuguese national origin by their surnames alone is subject to question. A Portuguese could have a non-Portuguese surname because of marriage or because of a maternal Portuguese ancestry. A Portuguese surname is not conclusive of being Portuguese, again because of marriage or because one’s nation of origin might previously have been governed by Portugal, such as Brazil.
Clearly, an absolutely accurate count of Portuguese workers cannot be obtained by examining surnames alone. Nonetheless, as plaintiff argues, it cannot be said that no correlation exists between the number of Portuguese workers and the number of workers with Portuguese surnames. Therefore, the latter is some evidence of a group which may serve as a proxy for the former for purposes of comparing the skilled machinists at Avco with those in the qualified labor pool. The strength or weakness of the correlation and its effect upon plaintiff’s statistical conclusions are factual issues to be determined at trial.4
e. Summary
In summary, plaintiff’s statistical analysis has three notable weaknesses. First, a sample of skilled machinists in the area was used as a proxy for the actual qualified labor pool. Second, two companies’ workers were eliminated from the sample without justification, possibly skewing the results and reducing the sample, making it less conclusive. Finally, determining Portuguese national origin solely on surnames is potentially underinclusive and overinclusive. In light of the difficulties plaintiff faced in proving a Title VII disparate impact action based on Portuguese national origin, it cannot be said that no reasonable jury could hold for plaintiff as to the existence of a disparity between the proportion of Portuguese in the skilled machinist population at Avco and that in the qualified labor pool. See Anderson, 477 U.S. at 248, 106 S.Ct. at 2510. Frailty in plaintiff’s figures does not render them legally invalid. Summary judgment on this basis is therefore denied.
2. Causation
Plaintiff’s statistical evidence does not prove that the eight-year rule caused the disparity. Plaintiff concedes that his statistical expert cannot opine as to the precise cause of the disparity. His argument is that the four companies from which the comparison sample was drawn do not have a similar eight-year rule and, therefore, it must be inferred, for purposes of this motion, that the eight-year rule caused the disparity.
To hold that plaintiff may show causation merely by showing that the qualified work force is not subject to the challenged employment practice does not account for the myriad of factors which might have played a role in the makeup of that work force. Every firm’s work force results from a multi-faceted decision-making process involving varying people. Plaintiff’s evidence is insufficient as it cannot be found to prove that absence of the eight-year rule in the practices of the companies whose work forces were aggregated to form the comparison group is the only significant factor causing the makeup of that group as distinguished from the makeup of Avco’s relevant work force. There are innumerable distinctions that might be found in the circumstances at Avco and the circumstances at the four comparison firms; under plaintiff’s logic, any, and perhaps each, such difference could have been a cause of the statistical disparity between Portuguese and non-Portuguese. If this were so, then the use by Avco of the eight-year rule cannot be said to be probative of the cause of the disparity. This result contradicts not only common sense but also the established focus in Title VII disparate impact cases on the particular employment practice challenged. See Wards Cove, 490 U.S. at 656, 109 S.Ct. at 2124. Therefore, no reasonable inference of causation of the disparity can be drawn solely from plaintiff’s statistical evidence.
Other evidence, however, is sufficient to raise a genuine issue of fact as to causation. Specifically, the testimony of Angelo Tramontanis, a former chairman of the Skilled Trades Council of Local 1010, that in its 1970 labor agreement Avco switched from a ten-year rule to an eight-year rule for the very purpose of increasing minority enrollment in the skilled trades suggests that the time in service was a factor in the minority makeup of the work force. See Deposition of Angelo Tramontanis at 28-29, Exhibit 11 to the Affidavit of William Frumkin, filed April 8, 1991. It is not unreasonable to infer that if a ten-year rule was seen as dampening minority enrollment in the skilled trades, an eight-year rule might also have such an effect. On that basis, a jury could reasonably find for the plaintiff on causation. See Anderson, 477 U.S. at 248, 106 S.Ct. at 2510. Summary judgment on this basis is therefore denied.
C. Other Claims
Plaintiff challenges two other Avco employment practices: (1) refusing to accept employment with Avco in a relevant trade as “practical experience” under the eight-year rule, and (2) requiring proof of “practical experience” by “proper affidavits.” Third Amended Complaint at ¶ 27(b) and 27(c).
As to the former, plaintiff’s claim appears to be that Avco has refused to count time spent as a non-skilled worker toward the experience requirement. This claim challenges the application of the rule, whereas plaintiff’s first claim challenges the existence of the eight-year rule.
To show that this practice is a cause of the disparity, plaintiff offered evidence that thirty-four of the forty-five Portuguese machinists in non-skilled positions at Avco were employed at Avco as of 1982. They can therefore be inferred to have eight years of non-skilled experience. None of these employees have been promoted to the skilled trades. No other evidence has been offered pertaining to the question of why they have not been promoted. This evidence alone is insufficient to establish an inference of causation. Without comparable figures for non-Portuguese or non-minority employees, there is no basis to believe that this employment practice disproportionately affects Portuguese employees. As the eight-year rule is one of eligibility and not one of automatic promotion, there may be an equally large proportion of non-Portuguese machinists at Avco with eight or more years of non-skilled experience who have also not been promoted to skilled positions. Since there is no evidence from which to infer causation, summary judgment is granted as to the claim in ¶ 27(b) of the Third Amended Complaint.
Finally, plaintiff has introduced no evidence that Avco’s requirement of proof of practical experience by “proper affidavits” is a cause of the alleged disparity, and has not addressed this claim at all in his opposition. Summary judgment is therefore granted as to the claim in ¶ 27(c) of the Third Amended Complaint.
III. Conclusion
For the foregoing reasons, defendants’ motions for summary judgment (documents # 192 and # 195) are granted in part and denied in part.
SO ORDERED.
1. Defendants correctly note that the statistical methodology challenged in Guardians was somewhat stronger than that used here. In Guardians, the data was compiled by independent professionals and similar results were found using three different statistical methods. Guardians, 633 F.2d at 240.
2. Plaintiff originally used only three, but the fourth was added.
3. Defendants have not argued that plaintiff’s statistics would differ substantially had the two omitted companies been included in the analysis. Cf. Guardians, 633 F.2d at 240 n. 13, quoting Dothard v. Rawlinson, 433 U.S. 321, 331, 97 S.Ct. 2720, 2727-28, 53 L.Ed.2d 786 (1977) (“If the employer discerns fallacies or deficiencies in the data offered by the plaintiff, he is free to adduce countervailing evidence of his own. In this case no such effort was made.”).
4. A person with a Portuguese surname might have standing to bring a Title VII action even if he or she is not of Portuguese national origin, but rather has taken the surname through marriage or adoption. Cf. Whitney v. Greater New York Corp. of Seventh-Day Adventists, 401 F.Supp. 1363, 1365-66 (S.D.N.Y.1975) (white woman has standing to sue where she was discharged because of her social relationship with a black man); Jones v. United Gas Improvement Co., 68 F.R.D. 1, 7-8 (E.D.Pa.1975) (referring to Spanish-surnamed individuals).