Past ICBP Seminars
Thursday Oct 22nd, 2009 @ 4pm (Alway M108)
Daniela Witten, Stanford University
Sparse clustering: a method for clustering observations in high dimensions
Suppose that we wish to hierarchically cluster say 100 observations on
the basis of 30,000 gene expression measurements. One can either use
all 30,000 genes or filter down to a smaller subset. This subjective
decision often has a big impact on the resulting clustering. For
example, the underlying subgroups that we seek to discover by
clustering may be defined on only a subset of the genes, and the
remaining genes are "noise" genes or are part of pathways that are not
relevant to the subgroups. If we perform ordinary hierarchical
clustering with all of the genes, then the presence of unrelated genes
can obscure the true underlying subgroups, resulting in an
uninteresting clustering. We propose a method to adaptively (1) choose
a subset of genes to use in the clustering, and (2) obtain a more
accurate or interesting clustering on the basis of these genes. We
demonstrate the use of this method on a breast cancer gene
expression data set in which 4 breast cancer subgroups are discovered
(Perou et al., Nature 2000).
Tuesday Sept 9th, 2009 @ 4pm (Clark S362)
Dr. Abhishek Garg, Ecole Polytechnique Federale de Lausanne
Implicit methods for modeling gene regulatory networks
Abstract: Advancements in high-throughput technologies have resulted in a new
field of network biology where cellular behavior is modeled using
network structures that represent the influence of different
biological entities on each other giving rise to a dynamical system.
Technical challenges that are unique to biology as well as the need to
model large regulatory networks comprising thousands of nodes have
renewed interest in developing new computational techniques for
modeling complex dynamical systems in biology.
In this talk, I will present my work on developing a modeling
framework for such regulatory networks in biology based on Boolean
algebra and finite-state machines that are reminiscent of the approach
used for digital circuit synthesis and simulations in the field of
very-large-scale integration (VLSI). The proposed formalism enables a
common mathematical framework to develop computational techniques for
modeling different aspects of the regulatory networks. Based on the
above formalism, I will present algorithms that can be implemented
using implicit representation techniques based on binary decision
diagrams (BDDs) for efficient modeling of large gene regulatory
networks.
Thursday July 23rd, 2009 @ 4pm (Clark S363)
Dr. Peng Qiu, Stanford University
Sample progression analysis for microarray data
We proposed a computational framework to study the progression of biological processes, such as cancer progression, cell differentiation, etc. For example, cancer samples collected from individual patients may represent different stages of cancer progression; the relationship among these samples may lay out a pathway or trajectory of cancer progression. Our method is able to discover the pathway of the progression, and at the same time identify the markers that best define/reflect the progression.
The proposed progression analysis framework starts with gene clustering. We use a consensus algorithm to derive consistent, coherent gene modules, where genes within one module are highly co-expressed. Assuming that progression can be reflected by the gradual change/drift of the expression of a subset of genes, we propose to use minimum spanning trees to extract the progression order of microarray samples. For each gene module, we construct one minimum spanning tree, which describes the progression order among the samples defined by this gene module. Statistical analysis is then applied to evaluate the fit between all the modules and all the trees. If there exist multiple modules that fit well with multiple trees, these multiple modules are similar in the sense that they describe a common progression order of the microarray samples. Modules that are similar in our progression analysis do not have to be correlated, which means our framework is able to identify similarities that correlation or regression types of analyses cannot. The common progression order supported by multiple modules is highly likely to be biologically meaningful.
We applied the proposed framework to several cell cycle datasets, a B-cell differentiation dataset, and the Glas dataset, which contains paired FL and DLBCL samples. Our method was able to correctly recover the progression order of samples and identify the genes that reflect the progression order.
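As a rough illustration of the minimum spanning tree step described in this abstract, the sketch below builds a tree over samples from the expression values of one gene module. It is not the speaker's implementation: the use of Euclidean distance, Prim's algorithm, the function names, and the toy data are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(samples):
    """Prim's algorithm. Returns tree edges (i, j) over sample indices."""
    n = len(samples)
    if n == 0:
        return []
    # best[j] = (distance from sample j to the tree, nearest tree sample)
    best = {j: (euclidean(samples[0], samples[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        _, i = best.pop(j)
        edges.append((i, j))
        # Sample j is now in the tree; update distances of remaining samples.
        for k in best:
            d = euclidean(samples[j], samples[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


# Hypothetical module of two genes measured in four samples, listed in
# arbitrary order; the expression drifts gradually from sample 0 to sample 1.
samples = [
    (0.0, 0.0),
    (3.0, 3.1),
    (1.0, 0.9),
    (2.0, 2.1),
]
tree = minimum_spanning_tree(samples)
print(tree)
```

On this toy input the tree is a single path, 0 to 2 to 3 to 1, which is one way a progression order among samples can be read off a tree.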
Thursday March 26th, 2009 @ 4pm (Clark S361)
Dr. Denis Bronnikov, NextBio, Cupertino, CA
Effective Data Exploitation: The Key to Accelerating Research
Many public repositories of biological data exist, and the amount of
data within them is massive and rapidly growing. Effectively utilizing
this data can be extremely difficult for many reasons. Available data is
spread across multiple, disconnected databases (GEO, caBIG, Array
Express, etc.) with non-standard annotations. The diversity of organisms
and platforms makes drawing connections difficult. Furthermore,
comparing personal data to public work is inaccessible to many
experimentalists. The pace of research could be substantially
accelerated if this public data could be easily accessed, correlated,
and interpreted. Dr. Bronnikov will discuss approaches to effective data
exploitation using NextBio’s semantic search technologies.
Thursday February 26th, 2009 @ 4pm (Clark S361)
Andrew Gentles, Stanford ICBP
A pluripotency signature predicts histological transformation and influences survival in follicular lymphoma patients
Histological transformation of follicular lymphoma (FL) to diffuse large B cell lymphoma (DLBCL) is associated with accelerated disease course and drastically worse outcome, yet the underlying mechanisms are poorly understood. Using gene module network analysis of expression profiles of FL and DLBCL patients, we show that a modular regulatory hierarchy underlies histological transformation (HT). Expression programs associated with HT reflect variation in infiltrating T cells, cellular differentiation states, proliferative drive, and mitochondrial metabolism. Central to the module network hierarchy is a signature that is strikingly enriched for pluripotency-related genes that are typically expressed in embryonic stem cells (ESC), including MYC and its direct targets. This core ESC-like program was independent of proliferation/cell-cycle and overlapped, but was distinct from, normal B-cell transcriptional programs. A three-module survival model based on ESC programs could stratify patient outcomes in the training dataset as well as in an independent testing cohort of FL patients. The survival model was also predictive of propensity of the FL tumor to transform to DLBCL. Together, these findings suggest a central role for this ESC-like signature in the mechanism of HT and provide new clues for potential therapeutic targets.
Thursday November 20th, 2008 @ 1pm (Clark S361)
Paul Huang, MIT
"Unraveling Oncogenic EGFR Signaling Networks in Brain Tumors using
Quantitative Phosphoproteomics"
Glioblastoma multiforme (GBM) is the most aggressive adult brain tumor
and remains incurable despite multimodal intensive treatment regimens.
EGFRvIII is a truncated extracellular mutant of the EGF receptor (EGFR)
commonly found in GBMs that confers tumorigenic behavior. Although much
work has been done over the past decade to elucidate pathways involved
in EGFRvIII receptor signaling, the global map of the signaling networks
that it activates remains incomplete, making it difficult to assess
downstream components involved in EGFRvIII-mediated transformation. To
gain a molecular understanding of the mechanisms by which EGFRvIII acts,
we have employed mass spectrometry to quantitatively map cellular
signaling events activated by this receptor. I will focus on our
efforts in using quantitative phosphoproteomic approaches as
a tool for understanding the molecular basis of multiple aspects of
EGFRvIII tumor biology, including an unexpected network compensation
mechanism at the level of receptor phosphorylation. The use of
integrative systems biology as a means of novel drug target discovery
for the treatment of GBM will also be discussed.
Thursday August 28th, 2008 @ 4pm (Clark S362)
Anna Lapuk, Lawrence Berkeley National Laboratories
Microarrays and next generation sequencing identify alternatively spliced markers and therapeutic
targets in breast cancer
Alternative splicing (AS) is a major mechanism of gene expression regulation. More than half of all human genes undergo AS, providing a rich source of proteomic diversity. Many cancer genes have been shown to express different splice forms with properties distinct from the normal forms and beneficial for tumorigenesis. These isoforms may contain cancer-specific epitopes ideal for therapeutic applications. Here I will discuss a whole-genome detection of differential splicing in breast cancer using microarray profiling of a collection of ~30 cell lines. Using novel state-of-the-art statistical methods we have developed a computational pipeline for interrogation of splicing in microarray data. The robustness of our approach was demonstrated by a >90% success rate of experimental validation of AS predictions using RT-PCR. Further, we have sequenced transcriptomes of two cell lines (MCF7 and BT574) using Illumina sequencing technology and used this data for high-throughput validation of microarray predictions, significantly increasing the power of differential splicing detection. Functional annotation of the observed differential splicing events has provided important insights into the role of splicing in the global regulation of gene expression and cellular processes, and into the possible mechanisms of AS regulation. Pathways enriched with differentially spliced genes included Integrin Signaling, Tight Junction Signaling, and Ephrin Receptor Signaling, among others, which are known to be important players in breast tissue functions. We have observed protein networks highly saturated with the differentially spliced genes. These networks were associated with cytoskeleton organization, biogenesis and cell signaling.
Thursday June 26th, 2008 @ 4pm (Clark S361)
Su-In Lee, Stanford Computer Science
Machine learning approaches for understanding the genetic basis of complex traits
Humans differ in many "phenotypes" such as weight, hair color and, more importantly, disease susceptibility. These phenotypes are largely determined by each individual's specific genotype, stored in the 3.2 billion bases of his or her genome sequence. Deciphering the sequence by finding which sequence variations cause a certain phenotype would have a great impact. The recent advent of high-throughput genotyping methods has enabled retrieval of an individual's sequence information on a genome-wide scale. Classical approaches have focused on identifying which sequence variations are associated with a particular phenotype. However, the complexity of cellular mechanisms, through which sequence variations cause a particular phenotype, makes it difficult to directly infer such causal relationships. In this talk, I will present machine learning approaches that address these challenges by explicitly modeling the cellular mechanisms induced by sequence variations. Our approach takes as input genome-wide expression measurements and aims to generate a finer-grained hypothesis such as "sequence variations S induce cellular processes M, which lead to changes in the phenotype P". Furthermore, we have developed the "meta-prior algorithm", which can learn the regulatory potential of each sequence variation based on its intrinsic characteristics. This improvement helps to identify a true causal sequence variation among very many variations in the same chromosomal region. Our approaches have led to novel insights on sequence variations, and some of the hypotheses have been validated through biological experiments. Many of the machine learning techniques are generally applicable to a wide-ranging set of applications, and as an example I will present the meta-prior algorithm in the context of movie rating prediction tasks using the Netflix data set.
Thursday April 10th, 2008 @ 4pm (Clark S362)
Debashis Sahoo, Electrical Engineering and Computer Science, Stanford
Boolean analysis of gene expression datasets
The latest advances in DNA microarray technology and its ability to measure
expression of thousands of genes in a single experiment efficiently have
been the main driving force of current biology. As a result, the 21st century is
just beginning to see an explosion in the amount of biological
information being collected for various human diseases and other
biological samples. There is a need for a scalable approach for the analysis
and the organization of this vast amount of information into usable
forms. We introduce StepMiner, a tool for the analysis of timecourse microarray
data that discovers stepwise changes in the timecourse. StepMiner groups genes whose
expression changes in the same direction and at the same time.
Additionally, we present the use of Boolean implication relationships for mining massive
amounts of publicly available gene-expression microarray datasets.
The Boolean analysis results are easily understandable and directly interpretable
upon the inspection of the heatmap and the scatterplots. We demonstrate how Boolean
analysis is used to understand gene regulation, gene function and
various human diseases. We have also formally verified new biological hypotheses that
were generated using our Boolean implication network.
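To make the idea of a Boolean implication between two genes concrete, here is a minimal sketch. It is not StepMiner or the speaker's pipeline: the fixed threshold, the simple fraction cutoff for calling a quadrant sparse, and all names are assumptions chosen for illustration. Each sample is classified as high or low for genes x and y, and a nearly empty quadrant of the resulting scatterplot suggests an implication.

```python
def boolean_implications(x, y, threshold=0.5, max_fraction=0.05):
    """List implications suggested by sparse quadrants of the x/y scatterplot.

    threshold and max_fraction are illustrative assumptions, not values
    taken from the talk.
    """
    # Count samples in each (x is high, y is high) quadrant.
    quadrants = {(a, b): 0 for a in (False, True) for b in (False, True)}
    for xv, yv in zip(x, y):
        quadrants[(xv > threshold, yv > threshold)] += 1
    total = sum(quadrants.values())

    # An empty quadrant rules out one combination, which is an implication.
    labels = {
        (True, False): "x high => y high",
        (True, True): "x high => y low",
        (False, False): "x low => y high",
        (False, True): "x low => y low",
    }
    return [
        labels[q]
        for q, count in quadrants.items()
        if total and count <= max_fraction * total
    ]


# Hypothetical normalized expression of two genes across eight samples.
x = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9, 0.2, 0.1]
y = [0.1, 0.9, 0.2, 0.9, 0.8, 0.95, 0.8, 0.1]
print(boolean_implications(x, y))
```

In this toy data no sample has x high and y low, so the only relationship reported is "x high => y high". A real analysis would need a statistical test for sparsity rather than a raw fraction.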
Thursday March 6th, 2008 @ 4pm (Clark S362)
Marc Schaub, Computer Science, Stanford
In this talk, we introduce the Qualitative Networks framework, an
approach for constructing computational models of biological systems
that extends the framework of Boolean networks and uses formal
verification methods for the analysis of the model. An efficient
symbolic algorithm can be used to verify that all stable states of such
a model are consistent with a set of requirements derived from the
laboratory experimental data.
We present an application of this approach to the crosstalk between the
Notch and Wnt pathways in mammalian skin. We show how our approach
allows us to formulate a new biological hypothesis that has been
validated experimentally.
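For readers unfamiliar with stable-state analysis, the following sketch shows the underlying question on a plain Boolean network. It is only an illustration under stated assumptions: Qualitative Networks extend Boolean networks, and the talk describes an efficient symbolic algorithm, whereas this example uses exhaustive enumeration, a hypothetical two-node network, and a made-up requirement.

```python
from itertools import product


def stable_states(update_rules):
    """Enumerate states that every update rule leaves unchanged.

    update_rules maps a node name to a function of the full state.
    Exhaustive enumeration is exponential in the number of nodes and is
    used here only for clarity.
    """
    names = sorted(update_rules)
    found = []
    for values in product((False, True), repeat=len(names)):
        state = dict(zip(names, values))
        next_state = {name: update_rules[name](state) for name in names}
        if next_state == state:
            found.append(state)
    return found


# Hypothetical toy network: two nodes that inhibit each other.
rules = {
    "a": lambda s: not s["b"],
    "b": lambda s: not s["a"],
}
states = stable_states(rules)

# Hypothetical requirement standing in for one derived from experiments:
# in every stable state exactly one of the two nodes is active.
consistent = all(s["a"] != s["b"] for s in states)
print(states, consistent)
```

The check at the end mirrors the verification question in the abstract: whether all stable states of the model satisfy a given set of requirements.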
Thursday November 1st, 2007 at 4pm (Alway M114)
J. Christopher Anderson, UC Berkeley
Design of Therapeutic Bacteria
The concept of using live bacteria as therapeutic agents has existed
for well over a century and has led to the common use of Mycobacterium
bovis BCG to treat bladder cancer. The attraction of this approach
stems from the naturally evolved ability of pathogens to interact with
animal systems and the observation that specific natural isolates show
therapeutic benefits even without modification. The advent of genetic
engineering rebooted interest in these strategies and has resulted in
the development of live vaccines, immunostimulatory drugs, and cancer
therapies with varied success. The limitations in this area have been
the effectiveness of treatment, the robustness to animal and patient
variability, and side effects. These limitations of traditional
approaches make biological therapeutics an ideal testbed for the
emerging field of synthetic biology, a reductionist approach to
genetic engineering where complex dynamic systems are built ground-up
from well-characterized cellular chasses and functional parts. I will
describe the principles of synthetic biology and their application in
the construction of tumor-killing bacteria.
Thursday September 27th, 2007 at 4pm (Clark S361)
Rich Neve, Genentech
Systems biology for target identification and rational therapeutic
selection in breast cancer
The advent of high-throughput, high density genomic technologies has fueled
an exponential growth of information describing the molecular features of
disease. The fact that many cancers can now be divided into multiple
subtypes based on gene expression, genomic portraits or molecular pathways
has given biomedical research the ability to realize the dream of rational,
individualized treatment for patients based on molecular profiling. Critical
questions now being investigated include: (i) identifying novel therapeutic
targets; (ii) accurate diagnosis to select patients for targeted treatments;
(iii) understanding mechanisms of therapeutic resistance; (iv) identifying
optimal drug combinations.
This presentation will describe how we use a systems-biology approach to
begin to address these questions, and how a cell-based system may be used to
model many aspects of cancer biology.
Tuesday June 26th, 2007 at 12 pm (Clark Center S361)
Peng Qiu, Dept of Electrical and Computer Engineering, University of Maryland
"Dependence Model and Network for Cancer Classification, Prediction and
Biomarker Identification"
In recent years, high throughput measuring techniques (gene microarrays, protein mass spectrum) have made it possible to simultaneously monitor the expression of thousands of genes or proteins. A topic of great interest is to study the difference of gene/protein expressions between normal and cancer subjects. In the literature, various clustering methods have been proposed to analyze gene/protein data, and they are predominantly data-driven.
In our work, we propose an alternative model-driven approach. Our aim is to develop statistical models to systematically interpret the high throughput experiment data and reveal biological insights. We propose a dependence model which can be used to examine the interactions among genes/proteins: we can zoom out to study the big picture, the ensemble dependence relationships among groups of genes/proteins; we can also zoom in to examine the details, the relationship among individual genes/proteins. We have shown that the dependence model is highly effective in the classification of data from normal and cancer subjects. The dependence model carries biological meaning in the eigenvalue domain. Since the eigenvalues of the dependence model exhibit different patterns for normal subjects and subjects at different stages of cancer development, the dependence model has the potential to predict cancer development. The concept of a dependence network is proposed based on the dependence model. The interaction relationships among genes/proteins are modeled by the dependence network, from which we are able to reliably identify biomarkers, important genes/proteins for the early prediction and effective treatment of cancer.
Thursday June 28th, 2007 at 4 pm (Clark Center S361)
Narges Bani Asadi, Stanford University
Bayesian Structure Learning on Berkeley Emulation Engine 2
The study of Signal Transduction Networks (STN) is one of the major subjects of interest in Systems Biology. Signal transduction pathways are means of regulating numerous cellular functions in response to changes in the cell's chemical or physical environment.
We use the Bayesian network statistical framework to infer the causal interactions between the molecules from the experimental data. Bayesian Networks have been studied intensively in the last decade as a strong tool to learn causal relations between variables. The study of large networks, however, raises major challenges in network inference whose resolutions require fundamentally new strategies for algorithmic and hardware design.
In this project we used the Field Programmable Gate Array hardware platform to realize a much more efficient implementation of the structure learning algorithm. The strategy we pursue is to use reconfigurable chips for the basic unit and to then array them to achieve the necessary throughput. The goal of mapping an algorithm into a spatial computational structure is to minimize the need for memory and centralized control and maximize the number of distributed parallel computational units. The use of FPGA is particularly attractive for testing out the new computation models that are needed for network inference, as we can change the customized hardware design anytime in the process of our algorithm development, simply by reprogramming the FPGA.
To test our design we used real data sets (for an 11-node network) from Garry Nolan's lab and implemented our design on the Berkeley Emulation Engine 2 (BEE2) FPGA system. The result is extremely promising: by exploiting the parallelism available on the FPGA system as well as logic optimization, design of specialized circuitry, and use of the distributed on-chip memory system, we achieved a speedup factor greater than 1000 compared to our software implementation on a 16-node Apple Xserve cluster.
Thursday May 24th, 2007 at 4 pm (Clark Center S362)
Xiling Shen, Stanford University Electrical Engineering
Hybrid modeling and robustness verification of the Caulobacter cell cycle regulation
Modeling approaches and tools can enable increased understanding of
complex systems. Electrical engineers/computer scientists have
created a large number of models and tools to aid them in
understanding the electronic systems they create. In this talk, we
look at whether any of these concepts have utility in understanding
biological systems. We model Caulobacter crescentus, which has been a
model system for studying cell cycle regulation. Its core cell cycle
engine consisting of a cyclic cascade of master regulator proteins
senses and in turn drives the progressive cell functions such as
chromosome replication and cytokinesis, forming various interlocked
feedback loops.
Using a hybrid modeling approach (mixed-mode simulation in EE speak)
which combines continuous ODEs with discrete event-driven logics, we
are able to simulate the regulatory network in conjunction with the
cell functions to provide a holistic picture of the cell cycle. This
model matches all experimental measurements, including known
mutants. By further abstracting the hybrid model into an equivalent
electrical circuit composed of gates and state machines, we are able
to use another EE tool to systematically test the robustness of this
control network beyond typical experimental conditions in order to
identify critical parameter constraints. The results provide biological
insights into the architecture of the regulatory network as well as
clues for discovering hitherto unidentified regulatory pathways.
Thursday April 26th, 2007 at 4 pm (Clark Center S362)
Paul Spellman, Ph.D., Lawrence Berkeley National Laboratories
Modeling Cancer Signaling in a Panel of Breast Cancer Cell Lines
Breast cancers are heterogeneous in nearly every aspect. We have catalogued changes in genome copy number and gene expression for a diverse collection of breast cancers and breast cancer cell lines. We show that the aberrations in breast cancers are recapitulated in breast cancer cell lines. We have attempted to model several aspects of breast cancer cell lines to predict responses of corresponding tumors. We used the Pathway Logic modeling system to generate signaling network models for the cell lines. Pathway Logic is designed to build discrete, logical models of biological systems. Logical models are fundamentally related to schematic diagrams that show relationships among genes, proteins, and other cellular components, and are easily interpretable in a biological context. These models are therefore well suited for making predictions about key signaling events. We used mRNA and protein abundance data to populate a unique network model for each cell line. Genes or proteins that are differentially expressed across the cell lines were considered present in some cell lines and absent in others. With this approach, we were able to model the diversity of signaling across the panel of cell lines, as well as the features common to them. We found that the initial states of our models showed only a small amount of variation: of 286 components, only 13% varied across the cell lines. Importantly, even with this small amount of variation in the initial state, we were able to capture many of the key features used to classify these cell lines. Furthermore, the network models are vastly different: of the reactions predicted to be activated, over half varied across the cell lines. We used these network models to generate (and verify) two biological predictions about signaling in our cell lines. Our first prediction was that cell lines that express CAV1 and Integrin B1 have an alternate route for the activation of the MAPK cascade. Second, we have identified PAK1 as a key regulator of the RAF-MEK-ERK pathway in our cell lines. We have also examined the responses of the breast cancer cell lines to a substantial number of FDA approved therapeutic agents. This has allowed us to identify predictive signatures of response that, at least in one case, have been externally validated on tumor response data.
Thursday March 15th, 2007 at 4 pm (Clark Center S360)
Markus Covert, Ph.D., Assistant Professor of Bioengineering, Stanford University
Title: Dissecting NF-kappaB control: a systems biology approach
Abstract: NF-kappaB is an important transcription factor in inflammation and the immune response, and it has become clear that determining the dynamics of NF-kappaB activity will be vital to understanding certain types of cancer. We implemented an integrated computational/experimental approach to study the dynamics of the NF-kappaB signaling network under LPS stimulation. First, we generated quantitative data to develop a computational model of the NF-kappaB response to LPS. Using this model, we were able to predict the presence of a previously unknown factor in the LPS signaling pathway which we then verified experimentally. We have recently expanded this study using a system that enables us to quantify near-endogenous NF-kappaB activation dynamics in single primary cells.
Wednesday February 21st, 2007 at 12 pm (Clark Center S362)
Jonathan Irish, Ph.D., Stanford University
Single Cell Profiles of Altered Tumor and Host Cell Signaling in Lymphoma
Differences in cancer cell signaling govern outcomes as diverse as proliferation and cell death and can drive the malignant behavior of cancer cells. Using flow cytometry, it is now possible to track and analyze signaling events in individual cancer cells. Data from this type of analysis can be used to create a network map of signaling in each cell and to link specific signaling profiles with clinical outcomes. This form of 'single-cell proteomics' can identify pathways that are activated in therapy-resistant cells and can provide biomarkers for cancer diagnosis and for determining patient prognosis. We recently applied this signaling profiles approach to study B cell receptor (BCR) signaling in primary human B cells. We compared follicular lymphoma (FL) B cells and non-malignant host B cells within individual patient biopsies and identified BCR-mediated signaling events specific to lymphoma B cells. BCR-mediated signaling occurred more rapidly in tumor B cells from FL samples than in infiltrating nontumor B cells, achieved greater levels of per-cell signaling, and sustained this level of signaling for hours longer than nontumor B cells. The timing and magnitude of BCR-mediated signaling in nontumor B cells within an FL sample instead resembled that observed in mature B cells from the peripheral blood of healthy subjects. BCR signaling pathways that are potentiated specifically in lymphoma cells should provide new targets for therapeutic attention.
Wednesday January 17th 2007 at 4 pm (Clark Center S362)
Gal Chechik, PhD, Computer Science, Stanford University
Filling missing components in yeast metabolic pathways using functional motifs and heterogeneous data
The set of cellular metabolic reactions forms a complex network of interactions, but even in well studied organisms the resulting pathways contain many unidentified enzymes. We study how 'wiring' relations between genes in the yeast metabolic pathway are manifested in functional properties of genes and their products, including mRNA expression, protein domain content and cellular localizations. We develop compact and interpretable probabilistic models for representing protein-domain co-occurrences and gene expression time courses. The former can provide predictions relating domains and gene functions. The latter reveals relations between the activation of genes and the usage of their protein products in the pathways. These models are then combined and used for completing unidentified enzymes in the pathways, achieving accuracy that is significantly superior to existing state-of-the-art approaches.
Thursday December 7th at 4 pm (Clark Center S362)
Gill Bejerano, PhD; UC Santa Cruz; Incoming Assistant Professor of Developmental Biology and Computer Science, Stanford University
Ultraconservation, Living Molecular Fossils, and Human Gene Regulation
The recent sequencing of the human genome and multiple related genomes has uncovered an unexpectedly complex picture of genes, and the genomic regions that govern their expression. I will present a current view of human gene regulation, and some open questions it poses with respect to development, and related processes such as cancer. I will also discuss the discovery of ultraconserved elements, arguably the most puzzling regions in the human genome, and show a surprising origin for these and related genomic elements, whose functions, evolution and contribution to human disease we are only beginning to explore.
Monday October 2nd 2006 at 2 pm (Clark Center S360)
Carlo C. Maley, Ph.D., The Wistar Institute
The evolution of clones in neoplasms drives both progression to malignancy and
therapeutic resistance. We study evolution in neoplasms with an integrative
cancer biology approach. Our work consists of three mutually reinforcing
approaches: the application of evolutionary theory to the analysis of genetic
data from neoplasms, computational simulations of clonal evolution to generate
hypotheses, and evolutionary experiments in tissue culture. I will briefly
present results from each of these approaches. We applied evolutionary and
ecological theory to samples of pre-malignant Barrett's esophagus and found
that measures of genetic diversity predicted progression to malignancy. We have
used simulations of the evolution of therapeutic resistance as a platform to
develop hypotheses for how resistance might be avoided or suppressed. This led
us to the theory of benign cell boosters (increase the fitness of benign cells)
and the Sucker's Gambit (increase the fitness of chemosensitive cells), which
we are now testing in tissue culture. I will present some initial experiments
looking at the effects of acid and bile salts on Barrett's and normal
esophageal squamous cell lines in monoculture and in competition.
Monday September 18th 2006 at 2 pm (Clark Center S361)
Dr. Byron Ellis, Stanford Molecular Pharmacology
The Bayesian Network provides an appealing link between the qualitative
description of complex biological networks, such as signal transduction
pathways, and statistical frameworks employed in the analysis of high
throughput experiments, though to use these models effectively we must be
able to generate good posterior samples from the distribution on Bayesian
Networks, a traditionally difficult problem. To address this I will present
a simple extension to the Order MCMC approach, some details of a fast
implementation and some assessment of performance with applications to
multiparameter flow cytometry data.
Thursday August 24th 2006 at 2pm (Clark Center S360)
Mitchell Guttman, Center for Bioinformatics, University of Pennsylvania
A Computational Approach for Determining Conserved Aberration Patterns
in Cancer: A Case Study using Lymphoma
Conserved regions of
aberration in a cancer type can be important in turning on various
pathways necessary for cancer development and survival. Furthermore,
various patterns of aberrations seem to define specific cancer types
and subtypes. Therefore, determining regions of conserved aberration
within a class of samples is important to accurately profile a cancer
type.
To address this issue we will present a method for assessing the significance of concordant genomic aberrations across multiple array comparative genomic hybridization (aCGH) experiments. Additionally, we are interested in identifying regions that drive, distinguish, and characterize groups within the data. To this end we describe novel computational approaches to addressing these issues, including supervised and unsupervised approaches that yield interesting group patterns when most available methods fail. These computational approaches will be specifically illustrated on a lymphoma dataset generated at Stanford University. The results suggest highly conserved aberrations present on most chromosomes. Traditional unsupervised clustering as well as various supervised approaches failed to detect distinctions between high-grade lymphoma and low-grade lymphoma. Using our analytical methods, from preprocessing to analysis and subsequent supervised analysis, we were able to detect a specific region on chromosome 6 that distinguishes high-grade and low-grade lymphoma.