This page collects research questions that would be interesting to explore once we have an initial framework to play with.
- Can we better predict which optimizations to use if we take dynamic features of the application into account?
- What kinds of dynamic features can we capture?
- GCC and LLVM both support profile-guided optimization. Can we use the profile data they generate as a source of dynamic features?
- Can hardware counters be used?
- Some optimizations have parameters that aren't simply on or off. Can we learn good values for these parameters?
- What is the effect of applying N learnt optimizations, and then retaking the features?
- Which features should be in the feature vector?
Principal Component Analysis has been used to evaluate how much each feature varies across BEEBS. See the blog post here: http://mageec.org/2014/08/11/evaluation-of-static-program-features-used-in-mageec/#more-408
- Are the features compiler specific?
- What types of machine learning perform best for learning optimizations?
- Can the machine learning learn when to 'backtrack' and undo a previously applied optimization, based on benchmark features?
- How varied a set of benchmarks is needed to properly train a database?
- Can fewer benchmarks be used, but each benchmark altered by applying a random set of transformations?
- Do different sequences of optimizations need to be applied for different data sets?
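As a rough illustration of the Principal Component Analysis idea mentioned above, the sketch below finds the first principal component of a small feature matrix using power iteration. It is not the analysis from the blog post: the feature columns (basic blocks, instructions, and a constant feature) and all the numbers are made up for illustration, and a real study would more likely use a numerical library and look at more than one component.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of the columns of rows (one row per benchmark)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Power iteration: approximate dominant eigenvector and eigenvalue of cov.

    Assumes the starting vector is not orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all features are constant, nothing to find
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

# Hypothetical feature vectors: [basic blocks, instructions, constant feature].
features = [
    [2, 10, 1],
    [4, 21, 1],
    [6, 29, 1],
    [8, 40, 1],
]

cov = covariance_matrix(features)
component, eigenvalue = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance

print("first component:", component)
print("share of variance explained:", explained)
```

A feature whose weight in the leading components stays near zero (here the constant third column) varies little across the benchmarks, which makes it a candidate for removal from the feature vector.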
Multidimensional Cost Functions
- Can we optimize for energy and performance simultaneously?
- Can the balance between different cost metrics be altered?
- Does the database need to be retrained for different target metrics?
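One simple way to think about the balance between cost metrics is a weighted sum of normalized energy and execution time. The sketch below is only an assumption about how such a cost could be formed, not something the framework defines. The function name `combined_cost`, the weight parameter, and the measurements are all hypothetical.

```python
def combined_cost(energy, time, baseline_energy, baseline_time, energy_weight=0.5):
    """Weighted sum of energy and time, each normalized to a baseline run.

    energy_weight = 1.0 optimizes purely for energy, 0.0 purely for time.
    """
    if not 0.0 <= energy_weight <= 1.0:
        raise ValueError("energy_weight must be between 0 and 1")
    return (energy_weight * energy / baseline_energy
            + (1.0 - energy_weight) * time / baseline_time)

# Made-up measurements (energy, time) for two flag sets and a baseline build.
baseline = (10.0, 4.0)
measurements = {
    "-O2": (8.0, 2.0),
    "-Os": (6.0, 3.0),
}

def best_flags(energy_weight):
    """Return the flag set with the lowest combined cost for the given weight."""
    return min(
        measurements,
        key=lambda flags: combined_cost(
            measurements[flags][0], measurements[flags][1],
            baseline[0], baseline[1], energy_weight,
        ),
    )

print(best_flags(0.9))  # energy matters most
print(best_flags(0.1))  # time matters most
```

With these numbers the preferred flag set changes as the weight moves, which is the situation the questions above are concerned with: whether stored training results can be re-scored for a new weighting, or whether the database has to be retrained.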