MEMORANDUM AND ORDER
STEARNS, District Judge.
This decision follows a hearing held under the directives of
Markman v. Westview Instruments, Inc.,
517 U.S. 370, 116 S.Ct. 1384, 134 L.Ed.2d 577 (1996).
“Markman
requires a trial judge in a patent case to construe and define the contested claims of a patent. The task committed to the judge is to explain what the protected invention is, and sometimes what it is not, ideally in language that will be accessible to a lay jury.”
Biogen, Inc. v. Amgen, Inc.,
18 F.Supp.2d 105, 106 (D.Mass.1998).
As described by the inventor, Dr. George Pieczenik:
[t]he invention provides an efficient and convenient means for the identification and production of monoclonal antibodies to any specific region of any antigen or hapten of interest. Monoclonal antibody production, according to the invention, does not require antigenic stimulation of a host animal. This is a critical concept of the present invention. Such antigenic stimulation can be employed to increase the frequency for cognate hybridoma formation, but there will be a member of an antibody population (of a sufficiently large number of members) which will recognize the particular epitope even in the absence of such stimulation.
The invention involves the antibody binding properties of a test species, e.g., a peptide, but is totally independent of the ability of the test species to induce an antigenic response in vivo. The invention permits the identification of the specific peptide sequence on a protein that is recognized by an antibody, i.e., the epitope. The specificity of antibodies recognizing distinct sequences, or epitopes, on the same antigen can be differentiated. In addition, the invention permits the characterization and the localization on a chromosome of the nucleotide sequence encoding the amino acid sequence recognized by an antibody.
’363 Patent, Col. 5, lns. 29-50.
The utility of the invention, according to plaintiffs, derives from its “library” of peptide sequences, which allows an “antibody binding specificity to be determined without previous knowledge of antigenic sequences,” and its recognition “that the size of the bindable universe (epitopic) and binding universe (antibody) is limited and thus can be enumerated, recognized and synthesized.” Plaintiffs’ Response, at 1.
The invention has practical application in the development of pharmaceutical products like vaccines. The principal patent in dispute, U.S. Patent No. 5,866,363 (the ’363 patent), “Method and Means for Sorting and Identifying Biological Information,” contains two partially disputed independent claims, numbered 24 and 34.
The Disputed Claims
Claim 24 describes a “population of recombinant vectors” containing oligonucleotides that encode a population of peptides. It reads as follows:
24. A population of recombinant vectors comprising:
substantially identical autonomously replicating nucleic acid sequences comprising a recombinant structural gene, each structural gene having inserted therein a member of an oligonucleotide population, wherein each member of said oligonucleotide population has a coding region having a length from about 4 to about 12 nucleotide triplets that encodes a corresponding peptide sequence of from about 4 to about 12 L-amino acid residues, and wherein the sum of corresponding peptide sequences encoded by said oligonucleotide population represents at least about 10% of all possible peptide sequences of said length,
and wherein each member of said oligonucleotide population is contained in said recombinant vector population; and
wherein the recombinant structural genes are expressed upon transfer of said recombinant vectors into
Escherichia coli
host cells, and wherein expression of said recombinant structural genes yields polypeptides, each polypeptide comprising said corresponding peptide sequence.
Claim 34 describes a method of producing the population of peptides described in claim 24. It reads as follows:
34. A method of producing a population of epitopic peptide sequences, comprising of the steps of:
providing a population of recombinant
E. coli
cells, each of said cells containing at least one member of a recombinant vector population, each member of said vector population comprising substantially identical autonomously replicating nucleic acid sequences, said nucleic acid sequences comprising a recombinant structural gene, each structural gene having inserted therein one member of an oligonucleotide population wherein each member of said oligonucleotide population has a length from about 4 to about 12 nucleotide triplets that encodes a corresponding epitopic peptide sequence of from about 4 to about 12 L-amino acid residues, and wherein each member of said oligonucleotide population is contained in said recombinant vector population and wherein the sum of said corresponding epitopic peptide sequences represents at least about 10% of all possible peptide sequences of said length; and
culturing said recombinant
E. coli
cells to allow expression of said recombinant structural genes such that said epitopic peptide sequences are accessible to antibody recognition.
The
Markman
dispute focuses on the proper construction of the following language in claim 24:
wherein each member of said oligonucleotide population has a coding region having a length from about 4 to about 12 nucleotide triplets that encodes a corresponding peptide sequence of from about 4 to about 12 L-amino acid residues, and wherein the sum of corresponding peptide sequences encoded by said oligonucleotide population represents at least about 10% of all possible peptide sequences of said length[.]
The parties also dispute the meaning of nearly identical language in claim 34:
wherein each member of said oligonucleotide population has a length from about 4 to about 12 nucleotide triplets that encodes a corresponding epitopic peptide sequence of from about 4 to about 12 L-amino acid residues, and wherein each member of said oligonucleotide population is contained in said recombinant vector population and wherein the sum of said corresponding epitopic peptide sequences represents at least about 10% of all possible peptide sequences of said length[.]
Finally, the parties disagree over the proper definition of the term “oligonucleotide,” as it is used in the claims of the ’363 patent (and in two prior related patents).
Legal Principles
“[C]onstruction of a patent claim is a matter of law exclusively for the court.”
Markman v. Westview Instruments, Inc.,
52 F.3d 967, 977 (Fed.Cir.1995) (citations omitted). “[A]n inventor is not [ordinarily] competent to construe patent claims” because “it is not unusual for
there to be a significant difference between what the inventor thinks his patented invention is and what the ultimate scope of the claims is after allowance by the PTO.”
Solomon v. Kimberly-Clark Corp.,
216 F.3d 1372, 1380 (Fed.Cir.2000). Thus, in construing the claims of the patent, the court must adopt the perspective of a hypothetical practitioner of ordinary skill in the patent art as of the date of the original application.
Wiener v. NEC Electronics, Inc.,
102 F.3d 534, 539 (Fed.Cir.1996), overruled on other grounds by
Cybor Corp. v. FAS Technologies, Inc.,
138 F.3d 1448 (Fed.Cir.1998).
The hierarchy of accepted analytical tools requires a court to begin its analysis with the intrinsic evidence of record. A court should first “look to the words of the claims themselves, both asserted and nonasserted, to define the scope of the patented invention.”
Vitronics Corp. v. Conceptronic, Inc.,
90 F.3d 1576, 1582 (Fed.Cir.1996), citing
Bell Communications Research, Inc. v. Vitalink Communications Corp.,
55 F.3d 615, 620 (Fed.Cir.1995). The court should next look to the patent specification. “The specification contains a written description of the invention which must be clear and complete enough to enable those of ordinary skill in the art to make it and use it. Thus, the specification is always highly relevant to the claim construction analysis. Usually, it is dispositive; it is the single best guide to the meaning of a disputed term.”
Vitronics,
90 F.3d at 1582. Finally, the prosecution history of the patent may be consulted. “[T]he record before the Patent and Trademark Office is often of critical significance in determining the meaning of the claims,”
Vitronics,
90 F.3d at 1582, but “it too cannot ‘enlarge, diminish, or vary’ the limitations in the claims.”
Markman,
52 F.3d at 980.
The claims, specifications and file history constitute the patent’s “public record ... on which the public is entitled to rely.”
Vitronics,
90 F.3d at 1583. Thus, it is inappropriate for a court to consider extrinsic evidence, such as expert testimony, unless the testimony is necessary to understand the meaning or scope of a technical term in the claims.
Id.,
citing
Pall Corp. v. Micron Separations, Inc.,
66 F.3d 1211, 1216 (Fed.Cir.1995);
Markman,
52 F.3d at 980-981 (same). Expert testimony “may not be used to vary or contradict the claim language....” Nor may it contradict the import of other parts of the specification.
Vitronics,
90 F.3d at 1584 (citation omitted). “[W]here the public record unambiguously describes the scope of the patented invention, reliance on any extrinsic evidence is improper.”
Id.,
at 1583.
Discussion
The Competing Constructions
(a) “from about 4 to about 12 [nucleotide triplets] [L-amino acid residues]”
Plaintiffs construe this limitation, which is common to both claims, as encompassing lengths of from 3 to 13
random
triplets.
Plaintiffs’ argument focuses on the word “about” and its “clear warning” that exactitude is not being claimed. Plaintiffs’ Response, at 5. Dyax’s counter-construction centers on the consistent use by the patentee of the definite integers 4 and 12. “Nowhere in the specification did the patentee say that any integer within the range
should be afforded anything other than its ordinary accustomed meaning. Indeed, in the specification, when the patentee wished to refer to an amino acid sequence of length 12, he used the number 12; when he wished to refer to a length of 7 amino acids he used the number 7; and when he wished to refer to a 5 amino acid sequence, he used the number 5.” Dyax Brief, at 19. Thus, according to Dyax, “from about 4 to about 12” means from 4 to 12.
(b) “and wherein the sum of [corresponding peptide sequences] [claim 24] [said corresponding epitopic peptide sequences] [claim 34] [encoded by said oligonucleotide population] [claim 24] represents at least about 10% of all possible peptide sequences of said length”
While written slightly differently in the two claims, this limitation refers to the size of the peptide library needed to make the invention work. Plaintiffs offer no consistent construction of what is meant by “about 10% of all possible peptide sequences,” but suggest that “10%” can consist of: (1) 300,000 (or perhaps 30,000) distinct members for any coded library of random peptides with a length in the range of 5 to 13 amino acid residues; (2) 16,000 (or perhaps 1,600) distinct members for any coded library of random peptides with a length of 4 amino acid residues; and (3) 80 (or perhaps 800) distinct members for any coded library of random peptides with a length of 3 amino acid residues.
Plaintiffs’ Brief, at 11, 16. The limitation “all possible peptide sequences of said length” plaintiffs construe to mean “the complete range of possible epitopic peptide sequences ... within the range of 3 to 13 L-amino acid residues consistent with the means by which the ‘oligonucleotide population’ was generated.”
Id.,
at 11.
According to Dyax, the 10% limitation requires that the total of the peptide sequences encoded by the oligonucleotide population encompass at least 10% of the possible peptide sequences of a single given length within the range of from 4 to 12 L-amino acids. “All possible peptide sequences of said length,” Dyax construes to mean the number of sequences derived by the formula 20^L, where L represents the given length within the specified range of L-amino acid residues and 20 signifies the number of genetically encodeable amino acids. Thus, if L is 12, the possible number of sequences is 20^12 or 4.096 x 10^15, which when divided by 10 yields a library of 4.096 x 10^14 members. Dyax Brief, at 13-14.
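The arithmetic behind Dyax's construction can be checked with a short illustrative sketch. The Python below is not part of the record. Its function names are hypothetical, and it assumes only what the brief states: 20 genetically encodable amino acids, and a library threshold of one tenth of 20^L for a single given length L.

```python
AMINO_ACIDS = 20  # number of genetically encodable amino acids, per Dyax's brief


def possible_sequences(length: int) -> int:
    """All possible peptide sequences of one given length: 20 ** length."""
    return AMINO_ACIDS ** length


def ten_percent_library(length: int) -> int:
    """Library size equal to exactly 10% of the sequences of that length."""
    # 20 ** length is always divisible by 10 for length >= 1, so this is exact.
    return possible_sequences(length) // 10


# Tabulate every length in the range 3 to 13 discussed by the parties.
for length in range(3, 14):
    total = possible_sequences(length)
    print(f"L={length:2d}  all possible={total:.3e}  10%={ten_percent_library(length):.3e}")
```

Under these assumptions the threshold grows twentyfold with each added residue, which is why the choice of the length that "said length" refers to matters so much to the size of the library.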
Analysis
The parties’ dispute boils down to a basic difference in interpretation that plaintiffs accurately summarize as follows: “Dyax argues that [infringement] should be determined from the perspective of the
size of the peptide library made, whereas plaintiffs’ position is that infringement is determined by the size of the peptide library necessary to bind the desired target.” Plaintiffs’ Response, at 2. Plaintiffs, in other words, maintain that as Dr. Pieczenik refined his invention, he realized that “five amino acids [the pentapeptide] is a
representative length
of peptide sequences which can bind with differential specificity to an antibody.” Plaintiffs’ Brief, at 14 (emphasis in original). Moreover, “antibodies are
now known
to have specificities which can be competed by peptides in the range of 5-7 amino acids, with a mean in the range of around 5 amino acids.” Plaintiffs’ Response, at 10 (emphasis added). Thus, “the entire universe of antibodies is equivalent to the entire universe of epitopic peptides that are 5 amino acids long on average or 3.2 x 10^6 possible antibodies.” Plaintiffs’ Brief, at 15. Because “many of the encoded peptides will present sufficiently similar binding surfaces that a single antibody will react with any of them .... it is not necessary to have all, or even most, of the possible coding sequences represented.”
Id.,
at 15 (quoting from File History, at 202). In fact, “all possible antibodies will be found to bind specifically with one of the mixture of random peptides provided a) the peptides are 5-7 amino acid residues in length, and b) the mixture contains at least about 10% of all possible peptide sequences.”
Id.
(quoting File History, at 200). Therefore a library of “about” 300,000 members is all that is required to identify the “universe” of possible antibody binding sites.
Id.
This assertion is the crux of the dispute about the necessary size of the specified library because, as a matter of undisputed scientific fact, there are 20 naturally occurring amino acids. Thus, where the peptide length consists of 5 amino acid residues, the possible number of peptides is 20^5, or more conventionally stated, 3.2 x 10^6. Where, however, the length is 12 amino acid residues, the possible number of peptides is 20^12 or 4.096 x 10^15. It follows that a library containing 10% of all
possible
peptide sequences where the length is 12 would contain 4.096 x 10^14 members, as Dr. Pieczenik himself pointed out to the PTO in correcting the examiner’s assumption that the correct formula for calculating the possible number of peptide sequences where L is 12 is the inverse of 20^12, or 12^20. In traversing the examiner’s rejection, Dr. Pieczenik gave the following example. “For a peptide having a sequence length of 12 (L = 12), each position having an equal probability of being one of the 20 natural amino acids (N = 20), the number of possible sequences is N^L = 20^12, which can be converted to 4.1 x 10^15.” Dyax Brief, at 21 (quoting File History, at 734). He went on to point out that the examiner’s method resulted in a million-fold error on the high side.
Id.
(quoting File History, at 735).
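The size of the discrepancy described in the file history can be reproduced with a brief illustrative calculation. This sketch is not drawn from the record. The variable names are hypothetical, and it assumes the examiner's figure was 12^20 (the exponent and base of 20^12 interchanged), as described above.

```python
# N^L with N = 20 natural amino acids and L = 12 positions (Dr. Pieczenik's formula).
correct_count = 20 ** 12

# The inverted formula attributed to the examiner: base and exponent swapped.
examiner_count = 12 ** 20

# How many times larger the examiner's figure is than the correct one.
overstatement = examiner_count / correct_count

print(f"20^12 = {correct_count:.3e}")
print(f"12^20 = {examiner_count:.3e}")
print(f"overstatement factor = {overstatement:,.0f}")
```

On these assumptions the inverted formula overstates the count by a factor a little under one million, which is consistent with the "million-fold error on the high side" noted in the file history.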
The point is crucial because, as Dyax points out, “the peptides in [its] libraries are longer than 12 amino acids — indeed, some are longer than 60 amino acids. And, Dyax’s phage display libraries include far fewer than 10% of the possible peptide sequences for a selected peptide length.” Dyax Brief, at 9. A library of 300,000 members would represent but 0.0000000073% of the
possible
number of sequences where L is 12, when the formula advocated by Dyax and used by Dr. Pieczenik in his illustration to the PTO is applied.
See
Table, Dyax Reply, at 5. None of the corresponding percentages for lengths 6 to 13, which range from 0.47% (6) to 0.00000000037% (13),
could ever reasonably be thought to be “about 10%,” no matter how flexibly the limitation is to be read. It is therefore critical to an understanding of plaintiffs’ position to trace the elements of the argument that the “said” in the phrase “all possible peptide sequences of said length,” refers to pentapeptides.
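The percentages cited from Dyax's table can be approximated with the following illustrative sketch. It is not the table from the Dyax Reply. The function name is hypothetical, and the calculation assumes a fixed library of 300,000 distinct members measured against 20^L possible sequences for each length L.

```python
AMINO_ACIDS = 20        # naturally occurring amino acids
LIBRARY_SIZE = 300_000  # library size plaintiffs contend is sufficient


def percent_of_universe(members: int, length: int) -> float:
    """Percentage of all 20 ** length sequences covered by `members` distinct peptides."""
    return 100 * members / AMINO_ACIDS ** length


# Coverage of a 300,000-member library for each length from 5 to 13.
for length in range(5, 14):
    print(f"L={length:2d}: {percent_of_universe(LIBRARY_SIZE, length):.2e}%")
```

Only at a length of 5 does such a library come near 10% of the possible sequences; at every longer length the share falls by a further factor of 20.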
To the extent that plaintiffs’ argument is based on the actual language of claims 24 and 34, it rests on the supposed difference between the meaning of “selected length” (the term used in the antecedent application) and the term “said length” (the term ultimately chosen). “Whereas
selected
refers to the random length selected
a priori, said
refers to the length of the random peptide sequence that, for example,
binds
to an antibody.”
Plaintiffs’ Response, at 8 (emphasis in original). This semantic change, plaintiffs argue, would have alerted an attentive reader of ordinary skill in the art, familiar with the “scientific presumption” that “antibodies are now known to have specificities which can be competed by peptides in the range of 5-7 amino acids, with a mean in the range of around 5 amino acids,” to the fact that a pentapeptide library is sufficient to define all peptide sequences with lengths from 6 to 13 amino acid residues.
Id.,
at 10. In other words, a library of 300,000 distinct figures (roughly 10% of 3.2 x 10^6) would completely satisfy the 10% limitation in the claims. “Said” is a term used by patent drafters who (like many lawyers) are unexplainably uncomfortable with using the more colloquial “the” when referring back to previously recited claim elements.
See Landis on Mechanics of Patent Claim Drafting
(2001) § 23. Neither claim 24 nor claim 34 makes any antecedent reference to pentapeptides as the sequence defining the “said” length. The element referenced is rather
“a
length from about 4 to about 12 nucleotide triplets,” that is, one of 9 (or 10) designated lengths with its corresponding peptide sequence. Pentapeptides are certainly one of these lengths, but not the only length referenced. The claims language, in other words, simply will not support the load-bearing weight plaintiffs attempt to assign to the word “said.”
Plaintiffs’ prosecution file history and prior art arguments fare no better. Much emphasis is placed on the qualified disclosure in the original 1985 application that
the size of the antibody recognition site corresponds to a peptide sequence in the range of between about 4 and about 12 amino acid residues ... [and that] there are about three million (20^5) different possible sequences of the twenty amino acid residues taken five at a time and about sixty million if the amino acid residues are taken six at a time. This finite number of peptide sequences may represent the full range of possible antibody recognition sites. Production and maintenance of a representative sample of the peptide sequences of the appropriate length provides the means (1) to screen any antibody of interest in order to determine the precise peptide sequence it binds to ....
Plaintiffs’ Brief, at 12 (quoting File History, at 14-15).
From this, plaintiffs deduce that it would have been “clear” to one skilled in the art that the inventor had
“recognized that random pentapeptides can adequately represent any random 12 amino acid sequence in terms of competitive binding to antibodies.”
Id.,
at 12-13. Why this is so is not explained in any meaningful way, other than by random citations to the discussion of the prior art in the original 1985 patent application, which, when read in context, offer no support for plaintiffs’ late-blooming theory that the ’363 patent teaches a universe of antibody binding sites bounded by pentapeptides. The citation to Geyson,
et al.,
in the file history is a good example. It is clear in context that Geyson was cited to explain to the PTO why degeneracy (the phenomenon by which an antibody may recognize more than one peptide sequence) made it possible to construct a working population consisting of only 10% of the peptides of a given length rather than, as the examiner thought would be necessary, the entire peptide population associated with that length. It does not follow from the discussion of Geyson (or Dame,
et al.,
the other principal prior art source cited) that the “prosecution file history make[s] clear to one skilled in the art that any coded library of random peptides with [a] length in the range of 5-13 amino acid residues and containing at least about 300,000 (e.g. 30,000 = 1%) distinct members is understood to mean an oligonucleotides population that
represents at least about 10% of all possible peptide sequences of said length.”
Plaintiffs’ Brief, at 15-16 (emphasis in original).
Conclusion
The limitation establishing a library of peptide sequences representing “at least about 10% of all possible peptide sequences” of “from about 4 to about 12 L-amino acid residues” has one definite term — “all possible” — and two indefinite terms — “at least about 10%” and “from about 4 to about 12.” There is no indication in the patent specification that Dr. Pieczenik intended these phrases to convey any meaning other than their ordinary English connotation. Thus, “all possible” can only be understood to mean the universe of peptide sequences associated with L-amino acid lengths of “from about 4 to about 12.” While I agree with plaintiffs that the term “about” is a term of deliberate imprecision that might fairly capture the integers 3 and 13 at the boundaries of “from about 4 to about 12,” the term “all possible” can only mean in context the entire universe of what could occur, that is, the total number of naturally occurring sequences that can possibly be associated with the selected length, whether 20^3 or 20^13 or some other specified length within the asserted range of 3 to 13 amino acid residues. In similar fashion, in the interest of lexicographic consistency, “at least about 10%” can be understood to perhaps capture 9%, or given the qualification of “at least about 10%,” perhaps a number substantially above 10%, but certainly not 1%, as plaintiffs’ expert, Dr. Makowski, maintains.
As Dyax points out, plaintiffs’ redefinition of the universe of antibody binding diversity as corresponding with a population of pentapeptide sequences “reads out” of the claims the range of peptides of from 6 to 12 amino acids in length “by making them synonymous with the 5 amino acid member of the range.” Dyax Brief, at 27. Like Dyax, I am puzzled why, if the point of the invention was to provide a population of peptide sequences representing the “universe of possible antibody binding sites,” the claims would have been written “to specify lengths of peptides that admittedly cannot do so,” or why it is not simply made clear that pentapeptide sequences define the intended universe. Dyax Response, at 8. Indeed, there is nothing said at all in the claims (or the specification) about this universe, nor is any meaningful suggestion made that longer peptides can be expressed as representative lengths of pentapeptides. Like Dyax, I can only conclude that plaintiffs’ “pentapeptide universe” theory is an attempt to expand on the claims of the patent to broaden their coverage for purposes of this litigation.
Oligonucleotide
With respect to the ’535 and ’266 patents, plaintiffs indicate agreement with the construction advanced by Dyax with the exception of the meaning of the limitation “oligonucleotide.” Plaintiffs’ Response, at 12-13. Plaintiffs maintain that “oligonucleotide” as used in these two patents would be understood by a person of ordinary skill in the art “to mean a polymor [sic] of nucleotides comprising at least a few nucleotides in length and not usually more than about 100,” although they insist that the term is given a “broader” meaning in the ’363 patent. Plaintiffs’ Brief, at 18, 19. According to plaintiffs, the file history of the ’363 patent “makes clear” that the term “oligonucleotide” as used in that patent signifies an oligonucleotide with “an upper limit at about 600 to about 750 nucleotides triplets in length.”
Id.,
at 8. This assertion is apparently based on a reference by Dr. Pieczenik in the prosecution file history to an oligonucleotide containing 50 tandem sequences of from about 4 to about 12 nucleic acid triplets (hence 50 x 12 = 600).
Id.
I find no support in the patent for plaintiffs’ narrow or broad definition of oligonucleotide. An oligonucleotide is defined in scientific and medical texts as a compound created by the condensation of a small number of nucleotides with 20 specified as the upper limit.
See, e.g., Stedman’s Medical Dictionary
(26th ed.1995) 1244. As for the idea that the upper limit might be as high as 600 or 750 triplets based on Dr. Pieczenik’s stray remark, neither claim 24 nor claim 34 makes any reference to an oligonucleotide made up of tandem sequences.
See Markman,
52 F.3d at 980 (“Although the prosecution history can and should be used to understand the language used in the claims, it too cannot ‘enlarge, diminish, or vary’ the limitations in the claims”).
ORDER
For the foregoing reasons, the court for
Markman
purposes will construe the disputed terms as follows. The limitation
“from about 4 to about 12 nucleotide triplets,” as used in claims 24 and 34 of the ’363 patent, is sufficiently indefinite to include a range whose boundaries are delimited by 3 and 13. Similarly, the limitation “from about 4 to about 12 L-amino acid residues” means a range of from 3 to 13 of such residues. The limitation “represents at least about 10% of all possible peptide sequences” means approximately 10% or more of the possible peptide sequences of a given length within the range of 3 to 13 L-amino acids where the number of possible peptide sequences is equal to 20^L. “Oligonucleotide” means a compound created by the condensation of typically fewer than 20 nucleotides.
SO ORDERED.