February 6, 2016. Article accepted for publication on cytogenetic image analysis using machine learning

Yanxin Li1, Joan H. Knoll2,3, Ruth Wilkins4, Farrah N. Flegal5, and Peter K. Rogan1,3*. Automated Discrimination of Dicentric and Monocentric Chromosomes by Machine Learning-based Image Processing. Departments of 1Biochemistry and 2Pathology and Laboratory Medicine, Schulich School of Medicine and Dentistry, University of Western Ontario; 3Cytognomix Inc.; 4Health Canada; and 5Canadian Nuclear Laboratories.

in the journal Microscopy Research and Technique.

Abstract: Dose from radiation exposure can be estimated from dicentric chromosome (DC) frequencies in metaphase cells of peripheral blood lymphocytes. We automated DC detection by extracting features in Giemsa-stained metaphase chromosome images and classifying objects by machine learning (ML). DC detection involves i) intensity-thresholded segmentation of metaphase objects, ii) chromosome separation by watershed transformation and elimination of inseparable chromosome clusters, fragments and staining debris using a morphological decision tree filter, iii) determination of chromosome width and centreline, iv) derivation of centromere candidates and v) distinction of DCs from monocentric chromosomes (MC) by ML. Centromere candidates are inferred from 14 image features input to a Support Vector Machine (SVM). 16 features derived from these candidates are then supplied to a Boosting classifier and a second SVM which determines whether a chromosome is either a DC or MC. The SVM was trained with 292 DCs and 3135 MCs, and then tested with cells exposed to either low (1 Gy) or high (2-4 Gy) radiation dose. Results were then compared with those of 3 experts. True positive rates (TPR) and positive predictive values (PPV) were determined for the tuning parameter, sigma. At larger sigma, PPV decreases and TPR increases. At high dose, for sigma = 1.3, TPR = 0.52 and PPV = 0.83, while at sigma = 1.6, TPR = 0.65 and PPV = 0.72. At low dose and sigma = 1.3, TPR = 0.67 and PPV = 0.26. The algorithm differentiates DCs from MCs, overlapped chromosomes and other objects with acceptable accuracy over a wide range of radiation exposures.
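For readers unfamiliar with the two metrics reported in the abstract, the following minimal sketch shows how TPR and PPV are conventionally computed from detection counts. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not taken from the paper or its data.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Hypothetical tally of classifier calls scored against expert review."""
    true_positives: int   # objects called DC that experts also scored as DC
    false_positives: int  # objects called DC that experts did not score as DC
    false_negatives: int  # expert-scored DCs that the classifier missed


def true_positive_rate(c: DetectionCounts) -> float:
    """TPR = TP / (TP + FN): the fraction of real DCs that were detected."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def positive_predictive_value(c: DetectionCounts) -> float:
    """PPV = TP / (TP + FP): the fraction of DC calls that were correct."""
    return c.true_positives / (c.true_positives + c.false_positives)


# Made-up counts for illustration only (not results from the study).
example = DetectionCounts(true_positives=40, false_positives=10, false_negatives=10)
example_tpr = true_positive_rate(example)
example_ppv = positive_predictive_value(example)
print(f"TPR = {example_tpr:.2f}, PPV = {example_ppv:.2f}")
```

A looser setting that accepts more candidate DCs tends to raise TP and FP together, which is why TPR can rise while PPV falls, as the abstract reports for larger sigma.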

A preprint of the paper is available at bioRxiv: http://biorxiv.org/content/early/2016/01/19/037309

January 17, 2016. MutationForecaster Workflow Updates.

New in MutationForecaster®: Improved, more comprehensive Workflows!

MutationForecaster now generates comprehensive genome interpretation on-the-fly. The results from all of our gene variant interpretation modules (Shannon Splicing Mutation Pipeline, ASSEDA, VEP, and Veridical) can now be automatically processed by CytoVA to find mutated genes in the genome related to particular phenotypes based on published literature. Results are also immediately processed to find dysfunctional biochemical pathways common to multiple mutated genes. All of the results are directly imported to your own CUVD repository, where the results for each variant are grouped together.

The process is completely unattended. Start the Workflow for a variant set from an exome or genome sequence; several hours later all of the analyses are finished for you to review in your own CUVD database.


December 21, 2015. New capability in Cytognomix User Variation Database (CUVD)

Every gene variant imported into CUVD from our other genome interpretation modules can be searched in several external databases seamlessly. Currently, all LOVD locus-specific databases, dbSNP, ClinVar, and the Exome Variant Server are searched together, and variants found in any of these resources are added to CUVD and hyperlinked when the search is completed. Until today, only one variant at a time could be searched.

As of today, CUVD can search these resources and retrieve data for batches of multiple variants with a single request (see below). Select all or just a group of variants in your database. MutationForecaster® estimates how long the search will take and notifies you when the task is complete. For example, searching 20 variants takes just over 1 minute. Replace outdated results when the databases are updated simply by repeating the search. Sign up for your free trial of MutationForecaster and try this exciting feature yourself!


December 17, 2015. Final version of machine learning-based chemotherapy response article is online

The final version of our paper:

Dorman S, Baranova K, Knoll J, Urquhart B, Mariani G, Carcangiu M-L, and Rogan PK. Genomic signatures for Paclitaxel and Gemcitabine resistance in breast cancer derived by machine learning. Mol. Oncology 10: 85-100, 2016. doi: 10.1016/j.molonc.2015.07.006

is available in print and here: Dorman et al. Mol. Onc. 10:85-100, 2016

December 14, 2015. Try MutationForecaster® for two weeks: Free of Charge!

We are excited to be able to offer our customers and registrants this opportunity to experience our integrated suite of genome interpretation products. For the first time, Cytognomix is offering a free trial of our MutationForecaster® genome interpretation suite to all registrants of the product. No subscription is required to analyze data with any of our software tools.  Trial users are provided with the same datasets that we have analyzed in our peer-reviewed publications. Start your trial whenever you’re ready.

The trial showcases many capabilities available to subscribers:

1.  Run all of our major software products with their built-in filters:
  • Automated Splice Site and Exon Definition Analysis (ASSEDA)
  • Shannon Splicing Mutation Pipeline
  • Veridical
  • Variant Effect Predictor (VEP)
  • Cytognomics Visualization Analytics for literature and genomic validation (CytoVA)
  • Cytognomix User Variation Database (CUVD)

2.  Customize results with any of these products:

  • Alter parameters and change information models in ASSEDA
  • Custom filtering of results obtained from the Shannon pipeline, Veridical, and VEP
  • Run literature or cytogenomic-based queries of Medline with CytoVA
  • Export results to CUVD, where you can search, modify, or analyze variants against a wide variety of external databases, then archive the search results
  • Download results from any product

3.  Streamline analysis of a dataset with all of these products in a single run using Workflows

Once you see the discoveries that only MutationForecaster® can make, we are confident that you will sign up for a subscription to analyze your own data.

Contact us if you have questions about the trial.

Happy holidays!

November 26, 2015. Splicing Mutation Calculator software

The Splicing Mutation Calculator web software described in:

Caminsky NG, Mucaki EJ and Rogan PK. Interpretation of mRNA splicing mutations in genetic disease: review of the literature and guidelines for information-theoretical analysis [version 2; referees: 2 approved] F1000Research 2015, 3:282 (doi: 10.12688/f1000research.5654.2)

has been migrated to the MutationForecaster system (http://mutationforecaster.com). Subscribers to MutationForecaster have unlimited access to this product.

The one-year free trial of this commercially-developed software has ended. The original website has been deprecated and no longer provides this functionality.


November 21, 2015. Literature based filtering in the MutationForecaster system

In next-generation sequencing, and in exomes in particular, the challenge is to find relevant pathogenic gene variants among a sea of superfluous sequence changes. But the track record for filtering the most likely causative changes is dismal (20-25%). Most filtering methods remove common variants but do little else. Cytognomix has developed CytoVA, software that relates variants to patient phenotypes through the peer-reviewed literature in real time. We are adding this to our MutationForecaster system. Check it out!

Upcoming Presentation at University of Windsor, Ontario, Canada.

Peter Rogan will present:

“Genomic analysis of metastasis and tumor chemotherapy response based on information theory and machine learning”

Department of Computer Science

University of Windsor

Date:  Friday, November 13th, 2015
Time: 11:00 am
Location: Chrysler Hall – G100

Abstract: The integrated analysis of cancer phenotypes with complex genomic datasets has resulted in many new insights into diagnosis and prognosis. However, there is no single correct way to analyze these data, and the data themselves can vary significantly in content and interpretation between different studies of the same tumor type. We have used mutation, expression and copy number data to study breast cancer genes and genomes (hereditary and somatic). A major challenge in inherited breast cancer is the missing heritability; pathogenic mutations are not detected despite strong family histories. Our approach has been to prioritize functionally significant variants using information theory-based models of DNA and RNA binding protein binding sites. These same approaches – when applied to breast tumour exome sequences – have revealed numerous missed mRNA splicing mutations, and identified mutated pathways, validated by RNA sequencing, that are overrepresented in these tumour genomes. Application of biochemically-inspired machine learning to these integrated genomic data from cell lines produces gene signatures that robustly predict therapeutic response that we have validated with patient tumor data. Machine learning is a promising general approach that can be used for other drugs and tumor types with good recall.

Presentation. 2015 Canadian Cancer Research Conference

Peter Rogan will be presenting:

Seeking the “Missing Heritability” in High-Risk Hereditary Breast and Ovarian Cancer (HBOC) Patients By Prioritizing Coding and Non-Coding Variants in 21 Genes. Natasha G. Caminsky, Eliseos J. Mucaki, Amelia M. Perri, Ruipeng Lu, Matthew Halvorsen, Alain Laederach, Joan H.M. Knoll, Peter K. Rogan

on Tuesday, November 10 from 12-2 PM in the poster session: Genomics, Proteomics, and Bioinformatics

in Montréal – Hôtel Bonaventure.

Scientific Program: link


Current BRCA1 and BRCA2 genetic testing for hereditary breast and ovarian cancer (HBOC) is often uninformative. The “missing heritability” may be due to variants in uninvestigated regions of these genes or variants in other genes. We have applied a unified framework based on information theory (IT) to predict and prioritize non-coding variants of uncertain significance. We captured complete gene sequences of 21 disease-relevant genes in HBOC patients with uninformative hereditary predisposition testing (N=336) by hybridization enrichment using ab initio single copy probes that comprehensively span non-coding regions and flanking sequences of ATM, ATP8B1, BARD1, BRCA1, BRCA2, CDH1, CHEK2, EPCAM, MLH1, MRE11A, MSH2, MSH6, MUTYH, NBN, PALB2, PMS2, PTEN, RAD51B, STK11, TP53, and XRCC2. We identified 38,538 unique variants. Eight were likely pathogenic BRCA1/2 mutations previously undetected by clinical testing. Eight protein-truncating mutations were identified in non-BRCA genes, the majority of which were in PALB2 (N=5), and 148 missense variants were flagged. Information weight matrices were derived for transcription factor (TFBS), splicing regulatory (SRBS), and RNA-binding (RBBS) protein binding sites from high-throughput sequencing data. IT analysis prioritized 12 variants affecting splicing (6 natural, 6 cryptic), 71 TFBS, 218 SRBS, and 29 RBBS. Co-segregation analysis found the relative risk of breast cancer for likely pathogenic BRCA variants to range from 1.55 to 75.78. According to clinically accepted guidelines, twenty-three were possibly pathogenic (13 confirmed by Sanger sequencing to date), 472 were of uncertain significance, and all remaining were likely not pathogenic. Complete gene analysis of BRCA1/2 and other genes is a successful strategy for identifying probable mutations in previously uninformative HBOC patients.
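To make the idea of an information weight matrix concrete, here is a minimal sketch of how a binding-site variant can be scored in bits. It assumes the standard individual-information formulation, in which each position contributes 2 + log2 f(b, l) for the observed base b at position l, and it omits the small-sample correction. The pseudocount, the function names, and the toy aligned sites are illustrative assumptions, not the matrices, software, or data used in the study.

```python
import math
from collections import Counter

BASES = "ACGT"


def weight_matrix(sites, pseudocount=1.0):
    """Build a per-position weight table, Riw(b, l) = 2 + log2 f(b, l).

    `sites` is a list of equal-length aligned binding-site sequences.
    The pseudocount (an assumption of this sketch) avoids log2(0) for
    bases never observed at a position.
    """
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        matrix.append(
            {b: 2.0 + math.log2((counts[b] + pseudocount) / total) for b in BASES}
        )
    return matrix


def individual_information(seq, matrix):
    """Sum the per-position weights of one sequence, in bits."""
    return sum(column[base] for base, column in zip(seq, matrix))


# Toy aligned sites, invented for illustration only.
toy_sites = ["CAGGTAAGT", "AAGGTAAGT", "CAGGTGAGT", "TAGGTAAGA", "CAGGTAAGG"]
toy_matrix = weight_matrix(toy_sites)

reference = "CAGGTAAGT"
variant = "CAGATAAGT"  # G>A at the fourth position, which is G in every toy site

delta_bits = individual_information(variant, toy_matrix) - individual_information(
    reference, toy_matrix
)
print(f"Change in information: {delta_bits:.2f} bits")
```

A negative change suggests that the variant weakens the site under this toy model. Prioritizing variants by the size of such changes is the general idea, though any thresholds would need to come from validated models.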

October 6, 2015. Presentations at the 2015 International EPR Biodose meeting

Drs. Joan Knoll and Peter Rogan gave platform presentations about the underlying algorithms and application of the Automated Dicentric Chromosome Identifier and Radiation Dose Estimator:

“Radiation dose estimation by automated chromosome biodosimetry” and

“Automated Discrimination of Dicentric and Monocentric Chromosomes by Machine Learning-based Image Processing”

at the EPR Biodose meeting at Dartmouth College, organized by the International Association of Biological and EPR Radiation Dosimetry.

Sept. 18, 2015. Press release about chemotherapy resistance paper

Western University hopes to use artificial intelligence to improve breast cancer patient outcomes.

(http://mediarelations.uwo.ca/2015/09/18/researchers-at-western-university-hope-to-use-artificial-intelligence-to-improve-breast-cancer-patient-outcomes/, other links at end of post)

Western University researchers are working on a way to use artificial intelligence to predict a patient’s response to two common chemotherapy medications used to treat breast cancer – paclitaxel and gemcitabine.

Peter Rogan, PhD, and a team of researchers, including Stephanie Dorman, PhD, and Katherina Baranova, BMSc, at Western’s Schulich School of Medicine & Dentistry, are hoping to one day remove the guesswork from breast cancer treatment with this technique.

Based on personal genetic analysis of their tumours, patients with the same type of cancer can have different responses to the same medication. While some patients will respond well and go into remission, others will develop a resistance to the medication.

Identifying the genetic factors which lead to resistance or remission can help develop better targeted, individualized treatment regimens with better patient outcomes.

“Treating patients with therapies that are the most likely to be successful can help reduce unnecessary toxicity and improve overall outcomes,” said Dorman.

Rogan and Joan Knoll, PhD, professor, Schulich Medicine & Dentistry, began by defining a stable set of genes in 90 per cent of breast cancer tumours in 2012.

Beginning with 40 genes including several stable genes, the team then used artificial intelligence combined with data from cell lines and tumour tissue from cancer patients who had treatment with at least one of the medications to narrow down and identify the genetic signatures most important for determining resistance and remission for each medication. Their study has recently been published in the journal Molecular Oncology.

Using the data, the researchers were able to identify the 84 per cent of women with breast cancer who would go into remission in response to the drug paclitaxel. The genetic signature identified for the drug gemcitabine was able to predict remission using preserved tumour tissue with 62 to 71 per cent accuracy.

Now, with this data in hand, the researchers are working to further refine the genetic signatures and improve the predictions further.

“Artificial intelligence is a powerful tool for predicting drug outcomes because it looks at the sum of all the interacting genes,” said Rogan, professor in the departments of Biochemistry, Oncology and Computer Science, Canada Research Chair in Genome Bioinformatics and president, Cytognomix Inc. “If we can use this technology to improve our knowledge of which medications to use, it could improve patient outcomes. The earlier we treat a patient with the most effective medication, the more likely we can effectively treat or possibly even cure that patient.”


Reference: Dorman SN, Baranova K, Knoll JH, Urquhart BL, Mariani G, Carcangiu ML, Rogan PK. Genomic signatures for paclitaxel and gemcitabine resistance in breast cancer derived by machine learning. Mol Oncol. 2015 Aug 22. pii: S1574-7891(15)00146-5. doi: 10.1016/j.molonc.2015.07.006. [Epub ahead of print] http://www.moloncol.org/article/S1574-7891%2815%2900146-5/fulltext

MEDIA CONTACT: Tristan Joseph, Media Relations Officer, Schulich School of Medicine & Dentistry, Western University, 519-661-2111 ext. 80387, c: 519-777-1573, tristan.joseph@schulich.uwo.ca


Western delivers an academic experience second to none. Since 1878, The Western Experience has combined academic excellence with life-long opportunities for intellectual, social and cultural growth in order to better serve our communities. Our research excellence expands knowledge and drives discovery with real-world application. Western attracts individuals with a broad worldview, seeking to study, influence and lead in the international community.


The Schulich School of Medicine & Dentistry at Western University is one of Canada’s preeminent medical and dental schools. Established in 1881, it was one of the founding schools of Western University and is known for being the birthplace of family medicine in Canada. For more than 130 years, the School has demonstrated a commitment to academic excellence and a passion for scientific discovery.

Follow Western Media Relations online:

Website: http://communications.uwo.ca/media/
RSS: http://feeds.feedburner.com/MediaWesternU
Twitter: https://twitter.com/mediawesternu

September 11, 2015. Final version of paclitaxel and gemcitabine chemotherapy signature paper now published


Genomic signatures for paclitaxel and gemcitabine resistance in breast cancer derived by machine learning

Stephanie N. Dorman, Katherina Baranova, Joan H.M. Knoll, Brad L. Urquhart, Gabriella Mariani, Maria Luisa Carcangiu, Peter K. Rogan
Received: July 20, 2015; Accepted: July 31, 2015; Published Online: August 21, 2015
Publication stage: In Press, Corrected Proof