Breast cancer progression signatures

Abstract

Methods and compositions for the identification of breast cancer progression signatures are provided. The signature profiles are identified based upon multiple sampling of reference breast tissue samples from independent cases of breast cancer and provide a reliable set of molecular criteria for identification of cells as being in one or more particular stages of breast cancer.

Claims

We claim:

1. An array comprising more than one of the genes in any one of Tables 2-5 hybridized to nucleic acids derived from a cell suspected of being abnormal or malignant.

2. The array of claim 1 wherein more than one comprises more than 5 of the genes in any one of Tables 2-5.

3. The array of claim 1 wherein more than one comprises more than 10 of the genes in any one of Tables 2-5.

4. The array of claim 1 wherein more than one comprises more than 11 of the genes in any one of Tables 2-5.

5. The array of claim 1 wherein said cell is from a subject afflicted with, or suspected of having, breast cancer.

6. The array of claim 5 wherein said subject is human.

7. An array comprising more than one of the genes in Tables 1 or 6 hybridized to nucleic acids derived from a cell suspected of being hyperplastic or cancerous.

8. The array of claim 7 wherein more than one comprises more than 5 of the genes in Tables 1 or 6.

9. The array of claim 7 wherein more than one comprises more than 10 of the genes in Tables 1 or 6.

10. The array of claim 7 wherein more than one comprises more than 11 of the genes in Tables 1 or 6.

11. The array of claim 7 wherein said cell is from a subject afflicted with, or suspected of having, breast cancer.

12. The array of claim 11 wherein said subject is human.

13. An array comprising more than one of the genes in Table 7 hybridized to nucleic acids derived from a DCIS cell.

14. An array comprising more than one of the genes in Table 8 hybridized to nucleic acids derived from an IDC cell.

15. A method to determine the breast cancer stage of a ductal lavage or fine needle aspiration sample from a subject comprising assaying said sample for expression of one or more genes correlated with one or more stages of breast cancer.

14. The method of claim 15 wherein said assaying comprises preparing RNA from said sample.

15. The method of claim 14 wherein said RNA is amplified.

16. The method of claim 15 wherein said assaying comprises using an array.

17. The method of claim 15 wherein said assaying comprises using the array of claim 1.

18. The method of claim 15 wherein said one or more genes are correlated with ADH, DCIS, and/or IDC.

19. The method of claim 15 wherein said one or more genes are correlated with normal and abnormal cells.

20. A method to determine breast cancer stage of a cell containing sample from a subject comprising assaying said sample for expression of one or more genes capable of discriminating between two stages of breast cancer.

21. The method of claim 20 wherein said sample is from a subject afflicted with, or suspected of having, breast cancer.

22. The method of claim 20 wherein said subject is human.

23. The method of claim 20 wherein said sample is a microdissected sample.

24. The method of claim 23 wherein said sample is microdissected via laser capture microdissection.

25. The method of claim 20 wherein said one or more genes discriminate between normal and abnormal cells, between cancerous and non-cancerous cells, or between stages of DCIS or IDC.

26. A method to determine therapeutic treatment for a patient determined to have atypical cells in a sample therefrom comprising assaying said cells for expression of genes correlated with non-cancerous and cancerous breast cancer cells, identifying the stage of breast cancer of said cells, and selecting the appropriate treatment for a patient having cells of such a stage.

27. A method to identify genes the expression of which are correlated with one or more stages of breast cancer comprising obtaining multiple homogenous populations of breast cancer cells from each of said one or more stages; identifying more than one gene the expression of which is correlated with said one or more stage by detecting genes with similar expression profiles in more than one of said populations.

Description

FIELD OF THE INVENTION

[0001] The invention relates to the identification and use of gene expression profiles, or patterns, involved in breast cancer progression. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other expression formats, are used in the study and/or diagnosis of cells and tissue during breast cancer progression as well as for the study and/or determination of prognosis of a patient. When used for diagnosis or prognosis, the profiles are used to predict the status and/or phenotype of cells and tissues relative to breast cancer and the treatment thereof.

BACKGROUND OF THE INVENTION

[0002] Breast cancer is by far the most common cancer among women. Each year, more than 180,000 and 1 million women in the U.S. and worldwide, respectively, are diagnosed with breast cancer. Breast cancer is the leading cause of death for women between ages 50-55, and is the most common non-preventable malignancy in women in the Western Hemisphere. An estimated 2,167,000 women in the United States are currently living with the disease (National Cancer Institute, Surveillance Epidemiology and End Results (NCI SEER) program, Cancer Statistics Review (CSR), www-seer.ims.nci.nih.gov/Publications/CSR1973 (1998)). Based on cancer rates from 1995 through 1997, a report from the National Cancer Institute (NCI) estimates that about 1 in 8 women in the United States (approximately 12.8 percent) will develop breast cancer during her lifetime (NCI's Surveillance, Epidemiology, and End Results Program (SEER) publication SEER Cancer Statistics Review 1973-1997). Breast cancer is the second most common form of cancer, after skin cancer, among women in the United States. An estimated 250,100 new cases of breast cancer are expected to be diagnosed in the United States in 2001. Of these, 192,200 new cases of more advanced (invasive) breast cancer are expected to occur among women (an increase of 5% over last year), 46,400 new cases of early stage (in situ) breast cancer are expected to occur among women (up 9% from last year), and about 1,500 new cases of breast cancer are expected to be diagnosed in men (Cancer Facts & Figures 2001 American Cancer Society). An estimated 40,600 deaths (40,300 women, 400 men) from breast cancer are expected in 2001. Breast cancer ranks second only to lung cancer among causes of cancer deaths in women. Nearly 86% of women who are diagnosed with breast cancer are likely to still be alive five years later, though 24% of them will die of breast cancer after 10 years, and nearly half (47%) will die of breast cancer after 20 years.

[0003] Every woman is at risk for breast cancer. Over 70 percent of breast cancers occur in women who have no identifiable risk factors other than age (U.S. General Accounting Office. Breast Cancer, 1971-1991: Prevention, Treatment and Research. GAO/PEMD-92-12; 1991). Only 5 to 10% of breast cancers are linked to a family history of breast cancer (Henderson I C, Breast Cancer. In: Murphy G P, Lawrence W L, Lenhard R E (eds). Clinical Oncology. Atlanta, Ga.: American Cancer Society; 1995:198-219).

[0004] Each breast has 15 to 20 sections called lobes. Within each lobe are many smaller lobules. Lobules end in dozens of tiny bulbs that can produce milk. The lobes, lobules, and bulbs are all linked by thin tubes called ducts. These ducts lead to the nipple in the center of a dark area of skin called the areola. Fat surrounds the lobules and ducts. There are no muscles in the breast, but muscles lie under each breast and cover the ribs. Each breast also contains blood vessels and lymph vessels. The lymph vessels carry colorless fluid called lymph, and lead to the lymph nodes. Clusters of lymph nodes are found near the breast in the axilla (under the arm), above the collarbone, and in the chest.

[0005] Breast tumors can be either benign or malignant. Benign tumors are not cancerous, do not spread to other parts of the body, and are not a threat to life. They can usually be removed and, in most cases, do not come back. Malignant tumors are cancerous, and can invade and damage nearby tissues and organs. Malignant tumor cells may metastasize, entering the bloodstream or lymphatic system. When breast cancer cells metastasize outside the breast, they are often found in the lymph nodes under the arm (axillary lymph nodes). If the cancer has reached these nodes, it means that cancer cells may have spread to other lymph nodes or other organs, such as bones, liver, or lungs.

[0006] Major and intensive research has been focused on early detection, treatment and prevention. This has included an emphasis on determining the presence of precancerous or cancerous ductal epithelial cells. These cells are analyzed, for example, for cell morphology, for protein markers, for nucleic acid markers, for chromosomal abnormalities, for biochemical markers, and for other characteristic changes that would signal the presence of cancerous or precancerous cells. This research has led to reports of various molecular alterations in breast cancer, few of which have been well characterized in human clinical breast specimens. Molecular alterations include presence/absence of estrogen and progesterone steroid receptors, HER-2 expression/amplification (Mark H F, et al. HER-2/neu gene amplification in stages I-IV breast cancer detected by fluorescent in situ hybridization. Genet Med; 1(3):98-103 1999), Ki-67 (an antigen that is present in all stages of the cell cycle except G0 and used as a marker for tumor cell proliferation), and prognostic markers (including oncogenes, tumor suppressor genes, and angiogenesis markers) like p53, p27, Cathepsin D, pS2, multi-drug resistance (MDR) gene, and CD31.

[0007] Examination of cells by a trained pathologist has also been used to establish whether ductal epithelial cells are normal (i.e. not precancerous or cancerous or having another noncancerous abnormality), precancerous (i.e. comprising hyperplasia, atypical ductal hyperplasia (ADH)) or cancerous (comprising ductal carcinoma in situ, or DCIS, which includes low grade ductal carcinoma in situ, or LG-DCIS, and high grade ductal carcinoma in situ, or HG-DCIS) or invasive (ductal) carcinoma (IDC). Pathologists may also identify the occurrence of lobular carcinoma in situ (LCIS) or invasive lobular carcinoma (ILC). Breast cancer progression may be viewed as the occurrence of abnormal cells, such as those of ADH, DCIS, IDC, LCIS, and/or ILC, among normal cells.

[0008] It remains unclear whether normal cells become hyperplastic (such as ADH) and then progress to become malignant (DCIS, IDC, LCIS, and/or ILC) or whether normal cells are able to become malignant directly without transitioning through a hyperplastic stage. It has been observed, however, that the presence of ADH indicates a higher likelihood of developing a malignancy. This has resulted in patients with ADH beginning treatment with an antineoplastic/antitumor agent such as tamoxifen. This is in contrast to the treatment of patients with malignant breast cancer, which usually includes surgical removal.

[0009] The rational development of preventive, diagnostic and therapeutic strategies for women at risk for breast cancer would be aided by a molecular map of the tumorigenesis process. Relatively little is known of the molecular events that mediate the transition of normal breast cells to the various stages of breast cancer progression. In particular, there is a significant paucity of information regarding the genetic changes that are associated with the earliest stages of human breast cancer, which include the transition of normal breast cells to atypical hyperplastic and/or pre-invasive malignant cells (carcinoma in situ).

[0010] Molecular means of identifying the differences between normal, non-cancerous cells and cancerous cells (in general) have also been the focus of intense study. The use of cDNA libraries to analyze differences in gene expression patterns in normal versus tumorigenic cells has been described (U.S. Pat. No. 4,981,783). DeRisi et al. (1996) describe the analysis of gene expression patterns between two cell lines: UACC-903, which is a tumorigenic human melanoma cell line, and UACC-903(+6), which is a chromosome 6 suppressed non-tumorigenic form of UACC-903. Labeled cDNA probes made from mRNA from these cell lines were applied to DNA microarrays containing 870 different cDNAs and controls. Genes that were preferentially expressed in one of the two cell lines were identified.

[0011] Golub et al. (1999) describe the use of gene expression monitoring as a means of cancer class discovery and class prediction between acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). Their approach to class predictors used a neighborhood analysis followed by cross-validation of the predictors by withholding one sample and building a predictor based only on the remaining samples. This predictor is then used to predict the class of the withheld sample. They also used cluster analysis to identify new classes (or subtypes) within AML and ALL.
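By way of non-limiting illustration, the withhold-one-sample validation scheme described above may be sketched as follows. The sketch substitutes a simple nearest-centroid predictor for the neighborhood analysis of Golub et al., and the function names and toy expression values are hypothetical, chosen only to make the example self-contained.

```python
from statistics import mean


def centroid(profiles):
    """Gene-by-gene average of a list of expression profiles."""
    return [mean(values) for values in zip(*profiles)]


def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train, sample):
    """Assign `sample` to the class whose centroid is closest.

    Assumption: nearest-centroid is a stand-in predictor, not the
    method used in the cited work.
    """
    by_class = {}
    for profile, label in train:
        by_class.setdefault(label, []).append(profile)
    return min(
        by_class,
        key=lambda lab: squared_distance(centroid(by_class[lab]), sample),
    )


def leave_one_out_accuracy(dataset):
    """Withhold each sample in turn, train on the rest, predict the withheld one."""
    correct = 0
    for i, (profile, label) in enumerate(dataset):
        train = dataset[:i] + dataset[i + 1:]
        if predict(train, profile) == label:
            correct += 1
    return correct / len(dataset)


# Hypothetical toy data: (expression profile over 3 genes, class label)
toy = [
    ([5.0, 1.0, 0.5], "AML"),
    ([4.5, 1.2, 0.4], "AML"),
    ([5.2, 0.8, 0.6], "AML"),
    ([1.0, 4.8, 3.9], "ALL"),
    ([0.9, 5.1, 4.2], "ALL"),
    ([1.2, 4.6, 4.0], "ALL"),
]

print(leave_one_out_accuracy(toy))
```

Because the predictor is rebuilt without the withheld sample on every pass, the reported accuracy is not inflated by a sample having been used to construct its own predictor.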

[0012] Gene expression patterns in human breast cancers have been described by Perou et al. (1999), who studied gene expression between cultured human mammary epithelial cells (HMEC) and breast tissue samples by use of microarrays comprising about 5000 genes. They used a clustering algorithm to identify patterns of expression in HMEC and tissue samples. Perou et al. (2000) describe the use of clustered gene expression profiles to classify subtypes of human breast tumors. Hedenfalk et al. describe gene expression profiles in BRCA1 mutation positive, BRCA2 mutation positive, and sporadic tumors. Sgroi et al. also analyzed gene expression of normal and breast cancer cells from a single patient. Using gene expression patterns to distinguish breast tumor subclasses and predict clinical implications is described by Sorlie et al. and West et al.

[0013] All of the above described approaches, however, utilize heterogeneous populations of cells found in culture or in a biopsy to obtain information on gene expression patterns. The use of such populations may result in the inclusion or exclusion of multiple genes from the patterns. For this reason, and because of a lack of statistical robustness, the gene expression patterns observed by the above described approaches provide little confidence that the differences in gene expression may be meaningfully associated with the stages of breast cancer.

SUMMARY OF THE INVENTION

[0014] The present invention relates to the identification and use of gene expression patterns (or profiles or "signatures") which are correlated with (and thus able to discriminate between) cells in various stages of breast cancer. Broadly defined, these stages are non-malignant versus malignant, but may also be viewed as normal versus atypical (optionally including reactive and pre-neoplastic) versus cancerous. Another definition of the stages is normal versus precancerous (e.g. atypical ductal hyperplasia (ADH) or atypical lobular hyperplasia (ALH)) versus cancerous (e.g. carcinoma in situ such as DCIS and/or LCIS) versus invasive (e.g. carcinomas such as IDC and/or ILC). DCIS may be further viewed as low grade versus high grade or grade I through grade III.

[0015] The gene expression patterns comprise one or more than one gene capable of discriminating between various stages of breast cancer with significant accuracy. The gene(s) are identified as correlated with various stages of breast cancer such that the levels of their expression are relevant to a determination of the stage of breast cancer of a cell. Thus in one aspect, the invention provides a method to determine the stage of breast cancer of a subject afflicted with, or suspected of having, breast cancer by assaying a cell containing sample from said subject for expression of one or more than one gene disclosed herein as correlated with one or more stages of breast cancer.

[0016] Gene expression patterns of the invention are identified by analysis of gene expression in multiple samples of each stage to be studied. The overall gene expression profile of each sample is obtained by analyzing the expressed or unexpressed state of genes in each stage relative to each other (one gene to another across all genes). This overall profile is then analyzed to identify genes that are positively or negatively correlated with a stage of breast cancer relative to other genes. An expression profile of a subset of human genes may then be identified by the methods of the present invention as correlated with breast cancer. The use of multiple samples increases the confidence with which a gene may be believed to be correlated with a particular stage. Without sufficient confidence, it remains unpredictable whether a particular gene is actually correlated with a stage of breast cancer and also unpredictable whether a particular gene may be successfully used to identify the stage of an unknown breast cancer cell sample.
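As a non-limiting sketch of how genes positively or negatively correlated with a stage might be ranked across multiple samples, the following example scores each gene with a signal-to-noise style statistic. The choice of statistic, the function names, the gene names, and the expression values are assumptions made for illustration and are not taken from the present disclosure.

```python
from statistics import mean, stdev


def signal_to_noise(stage_a, stage_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene's values in two stages.

    Positive scores mean higher expression in stage A, negative scores
    mean higher expression in stage B. Requires at least two samples per
    stage and a nonzero combined standard deviation.
    """
    return (mean(stage_a) - mean(stage_b)) / (stdev(stage_a) + stdev(stage_b))


def rank_genes(expression, labels, stage_a, stage_b):
    """Return (gene, score) pairs sorted by absolute score, strongest first.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of stage labels, one per sample, in the same order
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == stage_a]
        b = [v for v, lab in zip(values, labels) if lab == stage_b]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data: three DCIS samples followed by three ADH samples
labels = ["DCIS", "DCIS", "DCIS", "ADH", "ADH", "ADH"]
expression = {
    "gene_up": [8.0, 9.0, 10.0, 2.0, 3.0, 4.0],    # higher in DCIS
    "gene_down": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],   # higher in ADH
    "gene_flat": [5.0, 6.0, 7.0, 5.0, 6.0, 7.0],   # no difference
}

ranked = rank_genes(expression, labels, "DCIS", "ADH")
for gene, score in ranked:
    print(gene, round(score, 2))
```

With more samples per stage, the per-stage standard deviations are estimated more reliably, which is one way of seeing why multiple samples increase confidence in a gene's correlation with a stage.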

[0017] A profile of genes that are highly correlated with one stage relative to another may be used to assay a sample from a subject afflicted with, or suspected of having, breast cancer to identify the stage of breast cancer to which the sample belongs. Such an assay may be used as part of a method to determine the therapeutic treatment for said subject based upon the stage(s) of breast cancer identified.

[0018] The correlated genes may be used singly with significant accuracy or in combination to increase the ability to accurately discriminate between various stages of breast cancer. The present invention thus provides means for correlating a molecular expression phenotype with a physiological (cellular) stage or state. This correlation provides a way to molecularly diagnose and/or monitor a cell's status in comparison to different cancerous versus non-cancerous phenotypes as disclosed herein. Additional uses of the correlated gene(s) are in the classification of cells and tissues; determination of diagnosis and/or prognosis; and determination and/or alteration of therapy.

[0019] The ability to discriminate is conferred by the identification of expression of the individual genes as relevant and not by the form of the assay used to determine the actual level of expression. An assay may utilize any identifying feature of an identified individual gene as disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the gene. Identifying features include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express (RNA), said gene or epitopes specific to, or activities of, a protein encoded by said gene. All that is required is the identity of the gene(s) necessary to discriminate between stages of breast cancer and an appropriate cell containing sample for use in an expression assay.

[0020] In one aspect, the invention provides for the identification of the gene expression patterns by analyzing global, or near global, gene expression from single cells or homogenous cell populations which have been dissected away from, or otherwise isolated or purified from, contaminating cells beyond that possible by a simple biopsy. Because the expression of numerous genes fluctuates between cells from different patients as well as between cells from the same patient sample, multiple individual gene expression patterns are used as reference data to generate models which in turn permit the identification of individual gene(s) that are most highly correlated with particular breast cancer stages and/or have the best ability to discriminate cells of one stage from another.

[0021] In another aspect, the invention provides physical and methodological means for detecting the expression of gene(s) identified by the models generated by individual expression patterns. These means may be directed to assaying one or more aspects of the DNA template(s) underlying the expression of the gene(s), of the RNA used as an intermediate to express the gene(s), or of the proteinaceous product expressed by the gene(s).

[0022] In a further aspect, the gene(s) identified by a model as capable of discriminating between breast cancer stages may be used to identify the cellular state of an unknown sample of cell(s) from the breast. Preferably, the sample is isolated via non-invasive means. The expression of said gene(s) in said unknown sample may be determined and compared to the expression of said gene(s) in reference data of gene expression patterns from the various stages of breast cancer. Optionally, the comparison to reference samples may be by comparison to the model(s) constructed based on the reference samples.
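The comparison of an unknown sample to reference data may be illustrated, without limitation, by the following sketch, which assigns the unknown sample to the stage whose mean reference profile it correlates with most strongly. The use of Pearson correlation, the stage labels, and all numerical values are assumptions made for the example rather than a statement of the model(s) contemplated herein.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def closest_stage(unknown, reference):
    """Return (stage, correlation) for the best-matching reference stage.

    reference: dict mapping stage -> list of reference profiles, where each
    profile lists expression values for the same genes in the same order.
    """
    best_stage, best_r = None, -2.0  # any real correlation exceeds -2
    for stage, profiles in reference.items():
        mean_profile = [sum(col) / len(col) for col in zip(*profiles)]
        r = pearson(unknown, mean_profile)
        if r > best_r:
            best_stage, best_r = stage, r
    return best_stage, best_r


# Hypothetical reference data over four discriminating genes
reference = {
    "normal": [[1.0, 2.0, 3.0, 4.0], [1.2, 2.1, 2.9, 4.2]],
    "DCIS": [[4.0, 3.0, 2.0, 1.0], [4.1, 2.8, 2.2, 0.9]],
}
unknown = [3.8, 3.1, 1.9, 1.2]

stage, r = closest_stage(unknown, reference)
print(stage, round(r, 3))
```

Averaging the reference profiles for each stage before comparison is one simple way of comparing to a model built from the reference samples rather than to each reference sample individually.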

[0023] One advantage provided by the present invention is that contaminating, non-breast cells (such as infiltrating lymphocytes or other immune system cells) are not present to possibly affect the genes identified or the subsequent analysis of gene expression to identify the status of suspected breast cancer cells. Such contamination is present where a biopsy is used to generate gene expression profiles.

[0024] While the present invention has been described mainly in the context of human breast cancer, it may be practiced in the context of breast cancer of any animal known to be potentially afflicted by breast cancer. Preferred animals for the application of the present invention are mammals, particularly those important to agricultural applications (such as, but not limited to, cattle, sheep, horses, and other "farm animals") and for human companionship (such as, but not limited to, dogs and cats).

BRIEF DESCRIPTION OF THE FIGURES

[0025] FIG. 1 is a schematic representing a data matrix of a pair-wise comparison between Grade I and Grade III DCIS among 16 samples (across the top) and a large number of genes (identified by "CloneID") along the left-hand side.

[0026] FIG. 2 is a table showing the actual weight data corresponding to Example II, where the data from ten genes (by CloneID number vertically) are compared to DCIS and ADH samples (across the top). Some data in the table has been vertically presented to permit the table to be displayed on a single sheet. The use of "-" with data in the table reflects genes that are more highly expressed in ADH relative to DCIS. The absence of "-" reflects genes that are more highly expressed in DCIS relative to ADH.

[0027] FIG. 3 is a table showing the actual weight data corresponding to Example VII, where the data from over 300 genes (by CloneID number vertically) are compared to DCIS and ADH samples (across the top). Some data in the table has been vertically presented solely for display purposes. The use of "-" with data in the table reflects genes that are more highly expressed in ADH relative to DCIS. The absence of "-" reflects genes that are more highly expressed in DCIS relative to ADH.

[0028] FIG. 4 is a table showing the actual weight data corresponding to Example VIII, where the data from over 300 genes (by CloneID number vertically) are compared to samples (across the top) from two grades of DCIS. The use of "-" with data in the table reflects genes that are more highly expressed in grade I relative to grade III. The absence of "-" reflects genes that are more highly expressed in grade III relative to grade I.

DETAILED DESCRIPTION OF THE SPECIFIC EMBODIMENTS

[0029] Definitions of Terms as Used Herein:

[0030] A gene expression "pattern" or "profile" or "signature" refers to the relative expression of a gene between two or more stages of breast cancer which is correlated with being able to distinguish between said stages.

[0031] A "gene" is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. The term includes alleles and polymorphisms of a gene that encodes the same product, or a functionally associated (including gain, loss, or modulation of function) analog thereof, based upon chromosomal location and ability to recombine during normal mitosis.

[0032] A "stage" or "stages" (or equivalents thereof) of breast cancer refers to a physiologic state of a breast cell as defined by known cytological or histological (including immunohistology, histochemistry, and immunohistochemistry) procedures that are readily known to those skilled in the art. Non-limiting examples include normal versus abnormal, non-cancerous versus cancerous, the different stages described herein (e.g. hyperplastic, carcinoma, and invasive), and grades within different stages (e.g. grades I, II, or III or the equivalents thereof within cancerous stages).

[0033] The terms "correlate" or "correlation" or equivalents thereof refer to an association between expression of one or more genes and a physiologic state of a breast cell to the exclusion of one or more other stages and/or identified by use of the methods as described herein. A gene may be expressed at higher or lower levels and still be correlated with one or more breast cancer stages.

[0034] A "polynucleotide" is a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double- and single-stranded DNA and RNA. It also includes known types of modifications including labels known in the art, methylation, "caps", substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified forms of the polynucleotide.

[0035] The term "amplify" is used in the broad sense to mean creating an amplification product, which can be made enzymatically with DNA or RNA polymerases. "Amplification," as used herein, generally refers to the process of producing multiple copies of a desired sequence, particularly those of a sample. "Multiple copies" means at least 2 copies. A "copy" does not necessarily mean perfect sequence complementarity or identity to the template sequence.

[0036] By "corresponding" is meant that a nucleic acid molecule shares a substantial amount of sequence identity with another nucleic acid molecule. Substantial amount means at least 95%, usually at least 98% and more usually at least 99%, and sequence identity is determined using the BLAST algorithm, as described in Altschul et al. (1990), J. Mol. Biol. 215:403-410 (using the published default setting, i.e. parameters w=4, t=17). Methods for amplifying mRNA are generally known in the art, and include reverse transcription PCR (RT-PCR) and those described in U.S. patent application (number to be assigned) entitled "Nucleic Acid Amplification" filed on Oct. 25, 2001 as attorney docket number 485772002900 as well as U.S. Provisional Patent Applications 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), all of which are hereby incorporated by reference in their entireties as if fully set forth. Alternatively, RNA may be directly labeled as the corresponding cDNA by methods known in the art.
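The sequence identity thresholds in the above definition of "corresponding" may be illustrated by the following sketch, which computes percent identity over two sequences that have already been aligned. The sketch does not perform a BLAST search or alignment. The gap handling, the function names, and the example sequences are hypothetical and serve only to show how the 95%, 98%, and 99% thresholds would be applied.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned to equal length, with '-' marking
    gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def corresponds(aligned_a, aligned_b, threshold=95.0):
    """True if identity meets the threshold (95.0, 98.0, or 99.0 in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 100-nucleotide sequence and a copy with 3 substitutions
seq_a = "ACGT" * 25
variant = list(seq_a)
variant[0] = "C"    # A -> C
variant[10] = "T"   # G -> T
variant[50] = "A"   # G -> A
seq_b = "".join(variant)

print(percent_identity(seq_a, seq_b))
print(corresponds(seq_a, seq_b), corresponds(seq_a, seq_b, threshold=98.0))
```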

[0037] A "microarray" is a linear or two-dimensional array of preferably discrete regions, each having a defined area, formed on the surface of a solid support such as, but not limited to, glass, plastic, or synthetic membrane. The density of the discrete regions on a microarray is determined by the total numbers of immobilized polynucleotides to be detected on the surface of a single solid phase support, preferably at least about 50/cm², more preferably at least about 100/cm², even more preferably at least about 500/cm², but preferably below about 1,000/cm². Preferably, the arrays contain less than about 500, about 1000, about 1500, about 2000, about 2500, or about 3000 immobilized polynucleotides in total. As used herein, a DNA microarray is an array of oligonucleotides or polynucleotides placed on a chip or other surfaces used to hybridize to amplified or cloned polynucleotides from a sample. Since the position of each particular group of primers in the array is known, the identities of a sample's polynucleotides can be determined based on their binding to a particular position in the microarray.

[0038] Because the invention relies upon the identification of genes that are over- or under-expressed, one embodiment of the invention involves determining expression by hybridization of mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique to a particular gene sequence. Preferred polynucleotides of this type contain at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found in other gene sequences. The term "about" as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Even more preferred are polynucleotides of at least about 50, at least about 100, and at least about 150 basepairs of a gene sequence that is not found in other gene sequences. The term "about" as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value.
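The two meanings of "about" in the preceding paragraph may be worked through as in the following sketch. The function names are hypothetical, and the reading of "about" as a symmetric range around the stated value is an interpretation adopted only for this example.

```python
def about_bounds_short(stated):
    """'About' for the 20 to 32 basepair values: stated value plus or minus 1."""
    return stated - 1, stated + 1


def about_bounds_long(stated):
    """'About' for the 50, 100, and 150 basepair values: plus or minus 10%."""
    return stated * 90 / 100, stated * 110 / 100


# Ranges implied for the shorter preferred lengths
for stated in (20, 22, 24, 26, 28, 30, 32):
    low, high = about_bounds_short(stated)
    print(f"about {stated} bp: {low} to {high} bp")

# Ranges implied for the longer, even more preferred lengths
for stated in (50, 100, 150):
    low, high = about_bounds_long(stated)
    print(f"about {stated} bp: {low:g} to {high:g} bp")
```

Under this reading, "at least about 20" basepairs admits a lower bound as small as 19, while "at least about 50" basepairs admits a lower bound as small as 45.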

[0039] Alternatively, and in another embodiment of the invention, gene expression may be determined by analysis of expressed protein in a cell sample of interest by use of one or more antibodies specific for one or more epitopes of individual gene products (proteins) in said cell sample. Such antibodies are preferably labeled to permit their easy detection after binding to the gene product.

[0040] The term "label" refers to a composition capable of producing a detectable signal indicative of the presence of the labeled molecule. Suitable labels include radioisotopes, nucleotide chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic particles, bioluminescent moieties, and the like. As such, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.

[0041] The term "support" refers to conventional supports such as beads, particles, dipsticks, fibers, filters, membranes and silane or silicate supports such as glass slides.

[0042] As used herein, a "breast tissue sample" or "breast cell sample" refers to a sample of breast tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, breast cancer. Such samples are primary isolates (in contrast to cultured cells) and may be collected by any non-invasive means, including, but not limited to, ductal lavage, fine needle aspiration, needle biopsy, the devices and methods described in U.S. Pat. No. 6,328,709, or any other suitable means recognized in the art. Alternatively, the "sample" may be collected by an invasive method, including, but not limited to, surgical biopsy.

[0043] "Expression" and "gene expression" include transcription and/or translation of nucleic acid material.

[0044] As used herein, the term "comprising" and its cognates are used in their inclusive sense; that is, equivalent to the term "including" and its corresponding cognates.

[0045] Conditions that "allow" an event to occur or conditions that are "suitable" for an event to occur, such as hybridization, strand extension, and the like, or "suitable" conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event. Such conditions, known in the art and described herein, depend upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or transcription.

[0046] Sequence "mutation," as used herein, refers to any alteration in the sequence of a gene disclosed herein in comparison to a reference sequence. A sequence mutation includes single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to mechanisms such as substitution, deletion or insertion. A single nucleotide polymorphism (SNP) is also a sequence mutation as used herein. Because the present invention is based on the relative level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be assayed in the practice of the invention.

[0047] "Detection" includes any means of detecting, including direct and indirect detection of gene expression and changes therein. For example, "detectably less" products may be observed directly or indirectly, and the term indicates any reduction (including the absence of detectable signal). Similarly, "detectably more" product means any increase, whether observed directly or indirectly.

[0048] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

[0049] Specific Embodiments

[0050] The present invention relates to the identification and use of gene expression patterns (or profiles or "signatures") which discriminate between (or are correlated with) cells in various stages of breast cancer. Such patterns may be determined by the methods of the invention by use of a number of reference cell or tissue samples, such as those reviewed by a pathologist of ordinary skill in the pathology of breast cancer, which reflect various stages of breast cancer. Because the overall gene expression profile differs from person to person, cancer to cancer, and cancer cell to cancer cell, correlations between certain cell states and genes expressed or underexpressed may be made as disclosed herein to identify genes that are capable of discriminating between different breast cancer states.

[0051] The present invention may be practiced with any number of genes believed, or likely to be, differentially expressed in breast cancer cells. In Example I below, approximately 12,000 genes were used to identify hundreds of genes capable of discriminating between various stages of breast cancer as shown in Examples II to IX. The identification may be made by using expression profiles of various homogeneous normal and breast cancer cell populations, which were isolated by microdissection, such as, but not limited to, laser capture microdissection (LCM) of 100-1000 cells. Each gene of the expression profile may be assigned a weight based on its ability to discriminate between two or more stages of breast cancer (see Example I). The magnitude of each assigned weight indicates the extent of difference in expression between the two groups and is an approximation of the ability of expression of the gene to discriminate between the two groups (and thus stages). The magnitude of each assigned weight also approximates the extent of correlation between expression of individual gene(s) and particular breast cancer stages.

[0052] It should be noted that high levels of expression in cells from a particular stage or stages do not, by themselves, mean that a gene will be identified as having a high absolute weight value.
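The weighting described in the two preceding paragraphs can be illustrated with a minimal sketch. The weight formula is not given in this section, so the sketch assumes a signal-to-noise statistic (difference of group means divided by the sum of group standard deviations) as a stand-in. The function names, gene names, and expression values are hypothetical and serve only as illustration.

```python
import math
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """Signed weight (assumed statistic): mean difference over summed std devs."""
    diff = mean(group_a) - mean(group_b)
    spread = stdev(group_a) + stdev(group_b)
    if spread == 0:
        # No variation within either group: zero if the means agree,
        # otherwise an infinitely strong separation with the sign of diff.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / spread

def rank_genes(expression, labels, stage_a, stage_b):
    """Return (gene, weight) pairs sorted by descending absolute weight."""
    weights = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == stage_a]
        b = [v for v, lab in zip(values, labels) if lab == stage_b]
        weights[gene] = signal_to_noise(a, b)
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical toy data: one expression value per sample, per gene.
labels = ["normal", "normal", "normal", "DCIS", "DCIS", "DCIS"]
expression = {
    "gene_A": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],      # increased in DCIS
    "gene_B": [9.8, 10.1, 10.0, 10.0, 9.9, 10.2],  # high in every sample
    "gene_C": [4.0, 4.2, 3.9, 2.0, 2.1, 1.8],      # decreased in DCIS
}

ranked = rank_genes(expression, labels, "DCIS", "normal")
for gene, weight in ranked:
    print(f"{gene}: {weight:+.2f}")
```

In this toy data, gene_B has the highest expression level overall but ranks last, because its level barely differs between the two groups. The sign of a weight gives the direction of the change, and the absolute value is what the ranking uses.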

[0053] Genes with top ranking weights (in absolute terms) may be used to generate models of gene expression that would maximally discriminate between the two groups. Alternatively, genes with top ranking weights (in absolute terms) may be used in combination with genes with lower weights without significant loss of ability to discriminate between groups. Such models may be generated by any appropriate means recognized in the art, including, but not limited to, cluster analysis, support vector machines, neural networks or other algorithms known in the art. The models are capable of predicting the classification of an unknown sample based upon the expression of the genes used for discrimination in the models. "Leave one out" cross-validation may be used to test the performance of various models and to help identify weights (genes) that are uninformative or detrimental to the predictive ability of the models. Cross-validation may also be used to identify genes that enhance the predictive ability of the models.
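"Leave one out" cross-validation as described above can be sketched as follows. The classifier here is a nearest-centroid rule, chosen only for brevity; it is an assumption and not one of the model types named in the text. All function names and data values are hypothetical.

```python
from statistics import mean

def centroids(samples, labels):
    """Mean expression vector for each class in the training data."""
    result = {}
    for cls in set(labels):
        members = [s for s, lab in zip(samples, labels) if lab == cls]
        result[cls] = [mean(column) for column in zip(*members)]
    return result

def predict(sample, class_centroids):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((x - y) ** 2 for x, y in zip(sample, centroid))
    return min(class_centroids, key=lambda cls: distance(class_centroids[cls]))

def leave_one_out_accuracy(samples, labels):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(samples[i], centroids(train_x, train_y)) == labels[i]:
            correct += 1
    return correct / len(samples)

def select_genes(samples, gene_indices):
    """Keep only the chosen gene columns of each sample."""
    return [[sample[j] for j in gene_indices] for sample in samples]

# Hypothetical toy data: each sample is [informative gene, uninformative gene].
labels = ["normal", "normal", "normal", "DCIS", "DCIS", "DCIS"]
samples = [[1.0, 5.0], [1.2, 1.0], [0.9, 9.0],
           [3.1, 2.0], [2.8, 8.0], [3.3, 4.0]]

acc_all = leave_one_out_accuracy(samples, labels)
acc_informative = leave_one_out_accuracy(select_genes(samples, [0]), labels)
print("all genes:", acc_all)
print("informative gene only:", acc_informative)
```

Comparing the held-out accuracy of models built with and without a given gene is one way to judge whether that gene helps, adds nothing, or harms the predictive ability of a model.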

[0054] The gene(s) identified as correlated with particular breast cancer stages by the above models provide the ability to focus gene expression analysis to only those genes that contribute to the ability to identify a cell as being in a particular stage of breast cancer relative to another stage or stages. The expression of other genes in a breast cancer cell would be relatively unable to provide information concerning, and thus assist in the discrimination of, different stages of breast cancer. For example, the alpha subunit of human topoisomerase II (identified by CloneID 825470) has been found to be useful in discriminations between normal and atypical cells (ADH and DCIS and IDC and LCIS), between normal and ADH cells compared to DCIS and IDC cells, between normal and DCIS cells, between ADH and DCIS cells, between grade I and grade III DCIS cells, and between grade I and grade III IDC cells but not between normal and ADH cells (see Examples II to IX below). Thus expression of this topoisomerase II subunit would be utilized in models to discriminate between the above listed stages but not for discerning normal from ADH cells. This type of analysis is readily incorporated into algorithms used to generate models with reference gene expression data.
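A gene that is informative for some comparisons but not others, as in the topoisomerase II example above, can be handled by selecting genes separately for each comparison. The sketch below assumes that a weight per gene is already available for each comparison and that a simple absolute-weight cutoff is used. The cutoff, the function name, the gene names, and all weights are hypothetical and are not taken from the Examples.

```python
def informative_genes(weights_by_comparison, threshold):
    """Map each comparison to the genes whose absolute weight meets the threshold."""
    return {
        comparison: sorted(
            gene for gene, weight in weights.items() if abs(weight) >= threshold
        )
        for comparison, weights in weights_by_comparison.items()
    }

# Hypothetical weights: gene_X separates normal from DCIS but not normal from ADH.
weights_by_comparison = {
    "normal_vs_ADH": {"gene_X": 0.1, "gene_Y": -2.4},
    "normal_vs_DCIS": {"gene_X": 3.0, "gene_Y": 0.2},
}

selected = informative_genes(weights_by_comparison, threshold=1.0)
for comparison, genes in selected.items():
    print(comparison, genes)
```

A model for a given pair of stages would then be built only from the genes selected for that comparison.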

[0055] As will be appreciated by those skilled in the art, the models are highly useful with even a small set of reference gene expression data and can become increasingly accurate with the inclusion of more reference data although the incremental increase in accuracy will likely diminish with each additional datum. The preparation of additional reference gene expression data using genes identified and disclosed herein for discriminating between different stages of breast cancer is routine and may be readily performed by the skilled artisan to permit the generation of models as described above to predict the status of an unknown sample based upon the expression levels of those genes.

[0056] To determine the expression levels of genes in the practice of the present invention, any method known in the art may be utilized. In one preferred embodiment of the invention, expression based on detection of RNA which hybridizes to the genes identified and disclosed herein is used. This is readily performed by any RNA detection or amplification+detection method known or recognized as equivalent in the art such as, but not limited to, reverse transcription-PCR, the methods disclosed in U.S. patent application (number to be assigned) entitled "Nucleic Acid Amplification" filed on Oct. 25, 2001 as attorney docket number 485772002900 as well as U.S. Provisional Patent Applications 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), and methods to detect the presence, or absence, of RNA stabilizing or destabilizing sequences.

[0057] Alternatively, expression based on detection of DNA status may be used. Detection of the DNA of an identified gene as methylated or deleted may be used for genes that have decreased expression in correlation with a particular breast cancer stage. This may be readily performed by PCR based methods known in the art. Conversely, detection of the DNA of an identified gene as amplified may be used for genes that have increased expression in correlation with a particular breast cancer stage. This may be readily performed by PCR based, fluorescent in situ hybridization (FISH) and chromosome in situ hybridization (CISH) methods known in the art.

[0058] Expression based on detection of a presence, increase, or decrease in protein levels or activity may also be used. Detection may be performed by any immunohistochemistry (IHC) based, blood based (especially for secreted proteins), antibody (including autoantibodies against the protein) based, exfoliated cell (from the cancer) based, mass spectroscopy based, and image (including use of labeled ligand) based method known in the art and recognized as appropriate for the detection of the protein. Antibody and image based methods are additionally useful for the localization of tumors after determination of cancer by use of cells obtained by a non-invasive procedure (such as ductal lavage or fine needle aspiration), where the source of the cancerous cells is not known. A labeled antibody or ligand may be used to localize the carcinoma(s) within a patient.

[0059] A preferred embodiment using a nucleic acid based assay to determine expression involves immobilization of one or more of the genes identified herein on a solid support, including, but not limited to, a solid substrate as an array or beads or bead based technology as known in the art. Alternatively, solution based expression assays known in the art may also be used. The immobilized gene(s) may be in the form of polynucleotides that are unique or otherwise specific to the gene(s) such that the polynucleotide would be capable of hybridizing to a DNA or RNA corresponding to the gene(s). These polynucleotides may be the full length of the gene(s) or be short sequences of the genes that are optionally minimally interrupted (such as by mismatches or inserted non-complementary base pairs) such that hybridization with a DNA or RNA corresponding to the gene(s) is not affected.

20040002067 Breast cancer progression signatures