Capstone Project Report

I put my project report on GitHub. Below are my notes about the report, and a presentation is embedded in this post.

Capstone Project

This is my capstone project report from 2015. I did not edit the content of the report; I only removed personal information such as my student ID, address, and mobile phone number, and added a watermark for the license. It was my first academic paper, so I hope you will keep that in mind while reading it.

As mentioned above, the only changes are to the Curriculum Vitae page (page 26), to keep my private information off the Internet, and to the first page of the report, from which I removed my student ID.

The results for different distance methods are also available in CSV format, along with the source code of the voting_mrmr function and a demo dataset. I uploaded only the most important function as source code; please read its README file or the Code and Demo section below before running the code.

My 3-5 minute presentation is also available on this page. However, there is no transcript for the presentation.

If you use all or any part of the report file, please credit me in your paper with my name and a link to this GitHub page.

Code and Demo

There are two files in this section: one contains the MATLAB code and the other is a data set for a quick demo. Before taking a look at them, please read the notes below.

About Code File

I only published the code for the most important part of the project. Of course, I wrote and used other code to obtain and process the results, as well as scripts that apply voting_mrmr with different parameters. However, I think the main part of the project is enough for others to test it.

For more information, please read the project report.

voting_mrmr.m

It requires the Mutual Information toolbox and the MATLAB version of mRMR by H. Peng, which can be found here. It will not work unless both requirements are added to the MATLAB path correctly.
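As a rough sketch of how the path setup could be checked before calling voting_mrmr: the folder names below are hypothetical placeholders for wherever you unpacked the two downloads, and the function names mutualinfo and mrmr_mid_d are my assumption about what those packages provide, so adjust them to match the files you actually have.

```matlab
% Hypothetical install locations; replace with your own folders.
miToolboxDir = fullfile(pwd, 'mi');    % assumed folder of the Mutual Information toolbox
mrmrDir      = fullfile(pwd, 'mRMR');  % assumed folder of the MATLAB mRMR code

dirs = {miToolboxDir, mrmrDir};
for k = 1:numel(dirs)
    if exist(dirs{k}, 'dir') == 7
        addpath(dirs{k});              % make the folder visible to MATLAB
    else
        fprintf('Folder not found: %s\n', dirs{k});
    end
end

% Function names other than voting_mrmr are assumptions about the packages.
required = {'voting_mrmr', 'mutualinfo', 'mrmr_mid_d'};
for k = 1:numel(required)
    if exist(required{k}, 'file') == 0
        fprintf('Missing from the MATLAB path: %s\n', required{k});
    end
end
```

If nothing is reported as missing, the requirements are at least visible to MATLAB; it does not confirm that the toolbox's compiled parts were built for your platform.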

About Demo File

The demo data (demo.mat) was created for easy demonstration, based on the dataset described in the project report (Discretized NCI data (9 cancers, discretized as 3-states)), which is available here.

Please note that I am not the owner of the dataset, and it might be removed if the owner asks for it. It might have its own license, so please do not share it without visiting the source website.

For more information, please visit this page.

The demo file contains subsets of the real dataset:

  • S: Sample (20x9712)
  • T: Train (40x9712)
  • gS: Group of Sample (20x1)
  • gT: Group of Train (40x1)
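A quick way to confirm that the demo file loaded as described might look like the sketch below. It assumes demo.mat is in the current folder or on the MATLAB path and that the four variables are stored under exactly the names listed above; it only checks dimensions and does not run voting_mrmr.

```matlab
demoFile = 'demo.mat';

if exist(demoFile, 'file') == 2
    demo = load(demoFile);   % struct with one field per stored variable

    % Sizes taken from the list above (rows x columns).
    expected = struct('S', [20 9712], 'T', [40 9712], ...
                      'gS', [20 1],   'gT', [40 1]);

    names = fieldnames(expected);
    for k = 1:numel(names)
        actual = size(demo.(names{k}));
        assert(isequal(actual, expected.(names{k})), ...
            '%s has size %s, expected %s', names{k}, ...
            mat2str(actual), mat2str(expected.(names{k})));
    end
    fprintf('All four demo variables have the expected sizes.\n');
else
    fprintf('%s was not found in the current folder or on the path.\n', demoFile);
end
```

Each row of S and T is one sample with 9712 features, and gS and gT hold the matching group label for each row.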
capstone, report, final, machine learning, mrmr, knn, feature selection, classification