Difference between revisions of "Deliverable 5.2"

From MAGEEC
Revision as of 15:05, 30 August 2013


Deliverable 5.2: Selection of Core Machine Learning Algorithms

Status: Ongoing, options identified

Through experimentation, primarily using the WEKA framework, and discussions with the MILEPOST team, we have identified the following candidate core machine learning algorithms:

  • Decision Tree J48
  • KNN
  • SVM

In terms of validation methods, the following are good candidates:

  • 10-fold cross-validation
  • Leave-one-out validation (LOOV)

Of these, the following have been discarded as unsuitable:

  • SVM - would take too long to train on the data
  • Leave-one-out validation (LOOV)

The literature indicates that leave-one-out validation (LOOV) performs slightly worse than 10-fold cross-validation, and 10-fold cross-validation has the added benefit of being faster in this case. Both methods are approximately unbiased, with 10-fold having slightly lower variance, which is preferred (Efron, 1983).
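The difference in cost between the two validation schemes can be sketched in a few lines of Python. This is only an illustration, not the WEKA implementation: the helper names `k_fold_indices` and `train_test_splits` and the sample count of 50 are assumptions made for the example, and the folds are contiguous rather than shuffled or stratified.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_samples, k):
    """Yield (train, test) index lists, holding out one fold at a time."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


n = 50  # hypothetical number of training samples
ten_fold = list(train_test_splits(n, 10))
loo = list(train_test_splits(n, n))  # leave-one-out is the k == n case

# Each split requires training one model, so the split count is the cost.
print(len(ten_fold), len(loo))
```

With these assumed numbers, 10-fold cross-validation trains 10 models while leave-one-out trains one model per sample, which is where the speed advantage comes from.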


With SVM discarded, this leaves a choice between J48 and KNN, to be made by the end of September. The decision will take into account what our training data will be and how long each algorithm takes to train, as well as practicality, performance and reliability. Current research favours J48 because of its simplicity, the ease of translating it to C code, and its very fast model construction and evaluation.
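To illustrate why a decision tree is straightforward to translate to C, the sketch below turns a small tree into a C function made of nested `if`/`else` statements. Everything here is hypothetical: the tree, the feature names `loop_count` and `branch_count`, the thresholds, the class labels and the helper names are invented for the example, and this is not how WEKA exports a J48 model.

```python
# Hypothetical tree: internal nodes test "feature <= threshold",
# leaves carry an integer class label.
TREE = {
    "feature": "loop_count",
    "threshold": 4.0,
    "left": {"label": 0},
    "right": {
        "feature": "branch_count",
        "threshold": 10.0,
        "left": {"label": 1},
        "right": {"label": 0},
    },
}


def tree_to_c(node, indent=1):
    """Recursively render a node as C statements."""
    pad = "    " * indent
    if "label" in node:
        return f"{pad}return {node['label']};\n"
    return (
        f"{pad}if ({node['feature']} <= {node['threshold']}) {{\n"
        + tree_to_c(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_c(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def emit_c_function(tree, name, features):
    """Wrap the rendered tree in a C function taking one double per feature."""
    params = ", ".join(f"double {f}" for f in features)
    return f"int {name}({params})\n{{\n{tree_to_c(tree)}}}\n"


def predict(node, sample):
    """Evaluate the same tree in Python, to check the generated logic."""
    while "label" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


c_source = emit_c_function(TREE, "predict", ["loop_count", "branch_count"])
print(c_source)
```

The generated function contains only comparisons and returns, so evaluation needs no runtime library. KNN, by contrast, must keep its training samples available at prediction time.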