CancerDiscover

Open-source software pipeline for cancer classification from high-throughput data using machine learning.

View the Project on GitHub HelikarLab/CancerDiscover


A data mining suite for cancer classification

CancerDiscover is an open-source command-line pipeline (released under the GNU General Public License v3) that allows users to efficiently and automatically process large high-throughput datasets by converting data (for example, CEL files), normalizing it, and selecting the best-performing features from multiple feature selection algorithms. The pipeline lets users apply different feature thresholds and various learning algorithms to generate multiple prediction models that distinguish different types and subtypes of cancer.
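To make the "multiple prediction models" idea concrete, the sketch below simply enumerates every combination of feature selection algorithm, feature threshold, and learning algorithm. It is only an illustration of how the number of candidate models grows: the names `selector_a`, `learner_x`, and the threshold values are hypothetical placeholders, not CancerDiscover's actual options, and no model is trained here.

```shell
#!/bin/sh
# Illustrative placeholders only; these are NOT CancerDiscover's real option names.
selectors="selector_a selector_b"   # stand-ins for feature selection algorithms
thresholds="10 25 50"               # stand-ins for feature thresholds (e.g. top N features)
learners="learner_x learner_y"      # stand-ins for learning algorithms

count=0
for selector in $selectors; do
    for threshold in $thresholds; do
        for learner in $learners; do
            # One candidate model per (selector, threshold, learner) combination
            echo "model: features=$selector top=$threshold learner=$learner"
            count=$((count + 1))
        done
    done
done
echo "total candidate models: $count"
```

With two selectors, three thresholds, and two learners, the loop lists 12 combinations, which is why trying several settings quickly produces many models to compare.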

Cite: If you use our tool, please cite Mohammed, A., Biegert, G., Adamec, J., & Helikar, T. (2018). CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data. Oncotarget, 9(2), 2565–2573. (https://doi.org/10.18632/oncotarget.23511)

Note: CancerDiscover is open-source software. If you run into bugs or errors, please raise an issue on the project's GitHub repository (HelikarLab/CancerDiscover).

This README is a guide to using CancerDiscover. We suggest reading it through to get an idea of the available options and how to customize the pipeline to fit your needs.

System Requirements

CancerDiscover runs on Linux and macOS. You will need a current or very recent version of your operating system.

Downloading CancerDiscover and Dependencies

Linux:
curl -sL bit.do/installation_linux | sh

macOS:
curl -sL bit.do/installation_mac | sh
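The two installer commands above differ only in the short link, so the right one can be chosen from the kernel name that `uname -s` reports. The following is a minimal sketch under that assumption; the helper names `installer_url` and `install_cancerdiscover` are hypothetical and are not part of CancerDiscover.

```shell
#!/bin/sh
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to the installer short link listed in this README.
installer_url() {
    case "$1" in
        Linux)  echo "bit.do/installation_linux" ;;
        Darwin) echo "bit.do/installation_mac" ;;   # macOS reports "Darwin"
        *)      return 1 ;;                         # no installer listed for other systems
    esac
}

# Hypothetical wrapper: runs the same `curl ... | sh` command as above
# for the current operating system. It is defined here but not called.
install_cancerdiscover() {
    url=$(installer_url "$(uname -s)") || {
        echo "Unsupported operating system: $(uname -s)" >&2
        return 1
    }
    curl -sL "$url" | sh
}
```

After sourcing these definitions, calling `install_cancerdiscover` should behave like the matching one-line command for your platform, and it stops with an error message on any other system.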

To install CancerDiscover dependencies right from scratch, check out our exhaustive guides:

Directory Structure of the Pipeline

Execution of Pipeline

Contribution

Dr. Akram Mohammed akrammohd@gmail.com

Dr. Tomas Helikar (PI) thelikar2@unl.edu

Dr. Jiri Adamec jadamec2@unl.edu

Greyson Biegert greyson@huskers.unl.edu

License

This software has been released under the GNU General Public License v3.