Patrik Edén

Senior Lecturer

Classification and Computational Methods in Gene Expression Data Analysis


  • Cecilia Ritz

Summary, in English

cDNA microarray technology makes it possible to monitor the state of cells by measuring the activity of thousands of genes simultaneously. In cancer research, this high-throughput technique has enabled exploratory studies of the molecular mechanisms behind, for example, metastasis and response to therapy. This increased knowledge may lead to new therapies and improved prognostic and predictive tools. These tools, however, must be properly validated in large cohorts and subjected to large-scale trials before they are used in the clinic.

One aim of this thesis is to evaluate how well classifiers of clinical outcome for breast cancer based on gene expression data perform compared with conventional clinical markers. Additionally, we develop computational methods for analysis and classification using gene expression data. Our results suggest that clinical markers and molecular profiling have similar power in breast cancer prognosis. Further studies using larger cohorts are thus needed to validate and refine molecular prognostic profiles. We have also performed multicategory classification of leukemia into genetic subtypes and have predicted response to therapy in a subgroup. Our main contribution to the computational analysis is a method that improves missing value imputation for two-dye cDNA microarray data. Recognizing that some categories of missing values are over- or underestimated by a kNN-based imputation method, we suggest a linear model that corrects for this bias and improves imputation of these spots.
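As a rough illustration of the kind of kNN-based imputation the summary refers to, the sketch below fills each missing value with the mean of the same column in the k most similar rows (genes). It is a generic, minimal version under stated assumptions, not the method developed in the thesis: the function name `knn_impute`, the toy `log_ratios` matrix, the root-mean-square distance and the choice `k=2` are all hypothetical. The thesis's linear bias correction is not reproduced here, because the summary does not specify its form.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows (illustrative only)."""

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have column j observed
            # and share at least one observed column with this row.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical toy data: rows are genes, columns are arrays, None marks a missing spot.
log_ratios = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
]
imputed = knn_impute(log_ratios, k=2)
```

In this toy example the missing spot in the second row is filled from the first and third rows, which are its two closest neighbours. A bias correction of the kind the summary describes would be applied to such imputed values afterwards.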


  • Computational Biology and Biological Physics

Publishing year




Document type



Department of Theoretical Physics, Lund University


  • Biophysics


  • Bioinformatics
  • medical informatics
  • biomathematics biometrics
  • missing values
  • leukemia
  • cDNA microarray data
  • supervised classification
  • breast cancer
  • prognostic markers




  • Patrik Edén


  • ISBN: 978-91-628-7159-8

Defence date

11 May 2007

Defence time


Defence place

Lecture Hall F, Dept. of Physics


  • Carlos Caldas (Professor)