Press release Q&A

Our technique

The computer science technique we used is called "pattern recognition". It is already widely used in daily life, from recognizing faces in cameras to verifying identities with fingerprints at immigration.

When applied to pathology

This technology can recognize cancer cells in pathological specimens. It uses the appearance of each cell nucleus (its size, shape and color) to decide what type of cell it is: large, light-colored cells are cancer cells; long, thin cells are stroma; small, round, dark cells are lymphocytes.
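To make the three rules above concrete, here is a minimal Python sketch of a hand-written classifier. It is an illustration only: the feature names, units and every threshold are invented for this example, and a real tool would learn such boundaries from labelled cells rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Nucleus:
    """Hypothetical per-nucleus measurements (names and units are assumptions)."""
    area: float        # nucleus area, arbitrary units
    elongation: float  # major axis / minor axis; 1.0 means perfectly round
    intensity: float   # mean brightness; 0.0 = dark, 1.0 = light


# Illustrative thresholds only; they are not taken from the study.
LARGE_AREA = 60.0
LONG_AND_THIN = 2.5
LIGHT = 0.5


def classify(nucleus: Nucleus) -> str:
    """Apply the three rules of thumb described in the text."""
    if nucleus.elongation >= LONG_AND_THIN:
        return "stroma"            # long and thin
    if nucleus.area >= LARGE_AREA and nucleus.intensity >= LIGHT:
        return "cancer"            # large and light-colored
    if nucleus.area < LARGE_AREA and nucleus.intensity < LIGHT:
        return "lymphocyte"        # small, round and dark
    return "unclassified"          # does not match any rule cleanly


if __name__ == "__main__":
    for n in (Nucleus(120.0, 1.2, 0.8), Nucleus(40.0, 4.0, 0.4), Nucleus(25.0, 1.1, 0.2)):
        print(n, "->", classify(n))
```

Cells that fall between the rules are reported as "unclassified" instead of being forced into a category, which is one reason trained classifiers are preferred in practice.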

Why do we need this instead of traditional pathology?

Pathologists have analyzed tumor tissue for centuries, accumulating a large body of knowledge. We don't want to replace pathologists; instead, we aim to provide them with accurate, objective data, just as computers can't replace astrophysicists but can provide them with enormous quantities of data.

How does it benefit patients?

Medical research is now moving towards multi-center, large-scale studies with hundreds or thousands of patients. It is practically impossible for a single pathologist to review all these cases, and when the work is shared, different pathologists may have different opinions on the same case. Computers remove this observer bias and provide objective data so that many samples can be fairly compared. This is particularly important for therapeutic purposes: when looking for an indicator of suitable treatments for a patient, we can't afford any biases.

How long does it take?

With high-performance computing, hundreds of cases can be processed in one night, with tens of thousands of cells recognized in each case. A pathologist can only count up to 5,000 cells a day. Automated, modern techniques can therefore relieve pathologists of tedious counting.

What is new here in this particular study?

We are among the first to quantitatively combine phenotypical pathology image patterns with molecular-level data (genomics). Previous approaches have come from either pathology or genomics, and we are among the first to bridge the gap between these 'two cultures'. Understanding a tumor is just like understanding a car - we not only need to look from the outside to see the design and how it behaves, but should also look under the bonnet to find out the underlying mechanisms. By combining 'outside' and 'inside' in a new interdisciplinary approach, we see great potential for generating a comprehensive portrait of breast cancer.

What aspect of your findings surprised you the most?

Our technique allows us to investigate the spatial arrangement of cells in a tumor, unlike genomics, where all this information is averaged. We were surprised to find that a statistical method from ecology, often used to describe whether animals cluster together or scatter across a landscape, could describe the spatial behavior of cells in a tumor section, and that this could serve as an indicator for prognosis. This is a very nice synergy between ecology and cancer biology.
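As a rough illustration of what "clustered or scattered" means numerically, the sketch below computes a variance-to-mean ratio over quadrat counts, a simple dispersion index familiar from ecology. This is a stand-in chosen for illustration; the text does not say which statistic the study used, and the function name and grid parameters here are assumptions.

```python
from collections import Counter
from statistics import mean, variance


def dispersion_index(points, width, height, n_cols, n_rows):
    """Variance-to-mean ratio of point counts over a grid of quadrats.

    Values well above 1 suggest clustering, values near 1 a random
    scatter, and values below 1 a regular, evenly spaced pattern.
    """
    if not points:
        raise ValueError("at least one point is required")
    counts = Counter()
    for x, y in points:
        # Map each coordinate to a quadrat, clamping points on the far edge.
        col = min(int(x / width * n_cols), n_cols - 1)
        row = min(int(y / height * n_rows), n_rows - 1)
        counts[(row, col)] += 1
    # Include empty quadrats, which contribute zero counts.
    per_quadrat = [counts[(r, c)] for r in range(n_rows) for c in range(n_cols)]
    return variance(per_quadrat) / mean(per_quadrat)


if __name__ == "__main__":
    # Sixteen cells packed into one corner of a 100 x 100 section.
    clustered = [(10 + 0.5 * i, 10.0) for i in range(16)]
    # Sixteen cells, one at the center of each quadrat of a 4 x 4 grid.
    regular = [(12.5 + 25 * c, 12.5 + 25 * r) for r in range(4) for c in range(4)]
    print("clustered:", dispersion_index(clustered, 100, 100, 4, 4))
    print("regular:  ", dispersion_index(regular, 100, 100, 4, 4))
```

The result depends on the quadrat size chosen, so in practice such an index is usually compared across samples analyzed with the same grid.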

What is next?

Our approach helps to better characterize breast cancer by combining pathology with genomics. In the future, we hope to extend this line of work to other cancer types to help defeat cancer.

Project

Quantifying pathology for genomics
Joint corresponding work with the Markowetz Lab
Science Translational Medicine, 4, 157ra142 (2012).

We connected the fields of computer science and cancer research by training computer programs to automatically recognize cancer cells in tumor samples and to derive useful findings by combining the results with molecular signals.

For the geeks: We provide image analysis tools for automatic segmentation and classification of various cell types in pathological H&E images. We showed that quantitative cellular features can complement genomic and transcriptomic data to construct powerful prognosticators in two independent cohorts of 323 and 241 breast cancer patients.

Software

R

CRImage provides image analysis tools for segmentation, classification, and downstream analysis of H&E images. One application is cellularity scoring of tumours by counting the number of cancer cells and other cells. The package also comes with a novel algorithm for correcting copy-number data from SNP microarrays using estimates of tumour cellularity from the image analysis.
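The idea behind cellularity scoring and its use in copy-number correction can be sketched in a few lines of Python. This is not CRImage's API or its algorithm: the function names are hypothetical, and the correction shown is a generic linear mixing model that assumes normal cells carry two copies.

```python
from collections import Counter

NORMAL_COPY_NUMBER = 2.0  # assumption: non-cancer cells are diploid


def cellularity(cell_labels):
    """Fraction of classified cells that were labelled as cancer cells."""
    counts = Counter(cell_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells to score")
    return counts["cancer"] / total


def correct_copy_number(observed, tumour_fraction):
    """Estimate the tumour-only copy number from a mixed sample.

    Assumes observed = p * tumour + (1 - p) * normal, where p is the
    tumour cellularity, and solves that equation for the tumour term.
    """
    if not 0.0 < tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must be in (0, 1]")
    normal_part = (1.0 - tumour_fraction) * NORMAL_COPY_NUMBER
    return (observed - normal_part) / tumour_fraction


if __name__ == "__main__":
    labels = ["cancer"] * 6 + ["lymphocyte"] * 2 + ["stroma"] * 2
    p = cellularity(labels)
    print("cellularity:", p)
    print("corrected copy number:", correct_copy_number(3.2, p))
```

The sketch shows why cellularity matters: the lower the fraction of cancer cells in a sample, the more a real copy-number change is diluted in the measured signal.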

Download

BioConductor

Latest feature

The interactive cell classification session lets users assign cells to different categories by clicking directly on the images.