Meeting at Lymington, Wed July 31st 2013
Present: JB, KIE, MG, SC, JP, SH, OR
Framework update (Simon Cook)
- Draft spec presented, working on top and bottom elements of the design plan.
- We're using a GCC plugin to avoid the need to manipulate the GCC code base. Our plugin will implement a new pass to perform the feature manipulation that we need.
- GCC plugin interface is working, and hooks are implemented
- GCC<->MAGEEC<->Learner interfaces for the plugin. Each can talk to the command line.
- Can be made extensible in future using the <code>prog_feat</code> struct, which holds the pair <code>char *feat_name, void *feat_data</code>
- KIE: all our feature files should
- Need to fill in the code to implement the pass functionality
- MILEPOST used a similar technique.
- Q: C or C++ for the code base?
- A (JB): C++ aids in the structure and the migration of GCC from C to C++. Also LLVM is in C++.
- A class diagram needs to be created
- C++ multiple inheritance to be used with abstract classes acting as interfaces, giving an approach to inheritance similar to Java's.
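The <code>prog_feat</code> pair and the interface-style use of abstract classes discussed above could be sketched as follows. This is only an illustration under stated assumptions: the <code>prog_feat</code> fields come from the minutes, but the class names <code>FeatureSource</code>, <code>CommandLineClient</code> and <code>InstructionCounter</code>, and the feature name <code>insn_count</code>, are hypothetical and not part of the draft spec.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Name/data pair as described in the minutes. In this sketch the pointers
// are non-owning: whoever creates the pair keeps the storage alive.
struct prog_feat {
    char *feat_name;
    void *feat_data;
};

// Hypothetical Java-style "interface": pure virtual functions only,
// no data members, and a virtual destructor.
class FeatureSource {
public:
    virtual ~FeatureSource() {}
    virtual std::vector<prog_feat> features() = 0;
};

// A second hypothetical interface, for components that can report
// themselves on the command line.
class CommandLineClient {
public:
    virtual ~CommandLineClient() {}
    virtual std::string describe() const = 0;
};

// A concrete class implements both interfaces through multiple inheritance.
class InstructionCounter : public FeatureSource, public CommandLineClient {
public:
    explicit InstructionCounter(int count) : count_(count) {
        assert(count >= 0);  // a count cannot be negative
    }

    std::vector<prog_feat> features() override {
        static char name[] = "insn_count";  // hypothetical feature name
        return { prog_feat{ name, &count_ } };
    }

    std::string describe() const override {
        return "insn_count=" + std::to_string(count_);
    }

private:
    int count_;
};
```

Because the base classes hold no state, inheriting from several of them avoids the usual multiple-inheritance pitfalls, which is the property the Java-like approach is after.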
Machine Learning (Moon)
- Working on PCA analysis over 36 samples (WCET + 10 ported MiBench) and 55 (MILEPOST) features.
- To give us a set of features that cover the most variance in the data.
- WEKA machine learning toolkit
- (attribute,value) format
- Applies multiple off-the-shelf techniques
- Returns a list of weighted features to categorise the input data
- Does not tell us what the actual features are!
- Source code probably needs hacking to dump this information.
- Results from both WEKA and Matlab analysis show that only 3-4 features are needed to represent the data.
- First set covers 96% of the variance.
- Q: Need more tests
- A (SC): GCC regression test suite (100+) [gcc.c-torture], LLVM nightly test suite, coreutils? build essentials? Mibench, Hayden's applications.
- OR: To use PCA correctly to determine which feature vectors are useful and which can be discarded, we need to give it a very wide data set. Without one it cannot pick the vectors most relevant to all data, and will optimise only for the types of program presented. PCA determines patterns in the input set.
- Need to add large programs to our test set to make this work.
- Firefox, gcc, llvm, linux kernel, libreoffice, chromium
- KIE: Algorithmic determination. Once we have data set and features, we need to try multiple algorithms.
- Is the MILEPOST strategy repeatable?
- WEKA can deploy all useful propositional / non-representational techniques and collate results.
- See how well these perform.
- OR: [future work] Relational representation: progol and foil possible, progol the way forward.
- General methodology for that: stick to a relational database representation for our work.
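The PCA statements above (a few components covering most of the variance) rest on a simple calculation. Each principal component has an eigenvalue equal to the variance it captures, and the share covered by the first k components is their sum divided by the total. A minimal sketch, assuming the eigenvalues have already been obtained from WEKA or Matlab; the function names are hypothetical and any input values are illustrative, not the project's data:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Fraction of the total variance captured by the k largest eigenvalues.
// The vector is taken by value so it can be sorted without touching the caller's copy.
double explained_variance(std::vector<double> eigenvalues, std::size_t k) {
    assert(k <= eigenvalues.size());
    std::sort(eigenvalues.begin(), eigenvalues.end(), std::greater<double>());
    const double total =
        std::accumulate(eigenvalues.begin(), eigenvalues.end(), 0.0);
    if (total <= 0.0) {
        return 0.0;  // no variance at all, nothing to explain
    }
    const double kept = std::accumulate(
        eigenvalues.begin(),
        eigenvalues.begin() + static_cast<std::ptrdiff_t>(k), 0.0);
    return kept / total;
}

// Smallest number of components whose cumulative share reaches `target`
// (a fraction between 0 and 1).
std::size_t components_needed(const std::vector<double> &eigenvalues,
                              double target) {
    for (std::size_t k = 0; k <= eigenvalues.size(); ++k) {
        if (explained_variance(eigenvalues, k) >= target) {
            return k;
        }
    }
    return eigenvalues.size();
}
```

For example, with the made-up eigenvalues 4, 2, 1, 1 the first two components capture 0.75 of the variance, and three are needed to reach 0.8. Note that this only says how many components matter; mapping them back to the original features still needs the component loadings, which is the information the WEKA output currently hides.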
- Testing AVR compiler with 8 bit architecture with the BEEBS benchmarks.
- Blowfish can't run – size constraints
- Posted at a new github repository <code>lowpower-benchmarks</code> https://github.com/mageec/lowpower-benchmarks
- Small modifications needed e.g. random number generator port
- Good README file
- Results: timings differ by factors of 100 or more between benchmarks.
- JP: Different performance on ARM, which has everything within 2-3x.
- Any more candidates for the set (e.g. ones identified but not ported last summer) will happily be added by JB.
- All the benchmarks are now working and self verifying!
- Over the next couple of days I hope to have collected energy readings about these benchmarks so that we can check that the hardware is functioning properly, and that we're using a sensible value on the
- I can also now test the energy measurement part of the test framework.
- JB: The self-verification code should not contribute significantly to the features that are extracted. Aim for lightweight. If not possible, use an exclusion mechanism for Simon to define.
- James presented progress on Energy + benchmark papers
- Energy paper through first round of reviews at Computer Journal
- Approach for benchmarks paper approved. Aim to complete in next 3 weeks.
- First MAGEEC paper to cover ML, data and results from Moon's work, aiming for a first draft at the end of September. Target venue to be determined.
- Bin Tao's work includes an innovation to complement.
- HiPEAC in January is a potential target.
- 4th September for review of paper content.
Experiment run: code that straddles flash banks and runs on flash to see differences in energy
- James presented data from ARM Cortex M0 that showed energy vs flash memory access figures for loops with jumps straddling an offset.
- The pattern is very Z-shaped, with penalties (4-6% of energy) for not being 4-byte aligned.
- Several features (e.g. transitions at 128-byte boundaries) still need to be explained.
- Lower address numbers have
- A model can be created based on experiments, and correlates well to measurements.
- With longer loops, with more <code>nops</code> in-line, lower energy changes are seen.
- Due to the same op-codes being put onto the data lines each time.
- TODO: experiments where the null sequences in the loop are different op-codes. Experiments on M3.
- Project related code will be GPL. Potential for later dual-BSD licensing.
- Other material licensing to be advised in consultation with Andrew Back.
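For the flash experiments above, the two placement properties of interest (4-byte alignment of a loop and whether it straddles a 128-byte boundary) can be checked with a few lines of address arithmetic. This is a sketch only: the function names are hypothetical and the addresses used with it are illustrative, not taken from the Cortex M0 measurements.

```cpp
#include <cassert>
#include <cstdint>

// True when `boundary` is a non-zero power of two (4, 128, ...).
bool is_power_of_two(std::uint32_t boundary) {
    return boundary != 0 && (boundary & (boundary - 1)) == 0;
}

// True when `address` is a multiple of `boundary`.
bool is_aligned(std::uint32_t address, std::uint32_t boundary) {
    assert(is_power_of_two(boundary));
    return (address & (boundary - 1)) == 0;
}

// True when the byte range [start, start + length) crosses a multiple of
// `boundary`, i.e. its first and last bytes fall in different blocks.
bool straddles(std::uint32_t start, std::uint32_t length,
               std::uint32_t boundary) {
    assert(is_power_of_two(boundary));
    if (length == 0) {
        return false;  // an empty range cannot cross anything
    }
    // 64-bit arithmetic so start + length cannot wrap around.
    const std::uint64_t last = static_cast<std::uint64_t>(start) + length - 1;
    return (start / boundary) != (last / boundary);
}
```

An 8-byte loop placed at 0x7C straddles the 128-byte boundary at 0x80, while the same loop at 0x78 or 0x80 does not. Tagging each measured placement this way would make it easier to see which energy steps line up with which boundary.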
Community and website
- We are publishing data and databases (see new links on MAGEEC wiki and UoB wikis)
- We do want the wider community to contribute code/data where possible, so the strategy for these mechanisms needs to be developed, along with appropriate licensing and copyright consideration.
- Github should link from MAGEEC.
- People are now visible and design files will be published on github.
- JP going to give an energy measurement workshop at OSHCAMP
- 2xi7 machines with 32GB RAM to be purchased.
- The following other equipment was purchased for evaluation:
- 2x Raspberry Pi Model B
- 1x Pandaboard
- 5x Shrimping It kit
- Chipkit MAX32
- 5x Shrimping It components
- Arduino Uno
- Arduino Leonardo
- Arduino Mini
- Sparkfun USB Whacker
- M5282LITEE Eval board
- WP3 completed.
- Design to be ported to github repository.
- WP4: 2 months left, work to do discussed above.
- WP5: Deliverables on track for September completion.
- WP9: Energy efficient SIG to add to activities. Papers discussed. Website and github up and running.
- Risks reviewed; most judged OK so far, but it is still early in the project.
- New risk: hardware measurement boards may need to re-spin (5, 2, 10); mitigation using V2 boards + scope to re-spin.
Actions
- AW: Blog post
- SC: C++ class diagram
- MG: Hack source of WEKA to dump the actual features that matter.
- SC&OR: Contact MILEPOST contributor Albert Cohen to determine PCA experience on MILEPOST
- KIE: Arrange Mike O'Boyle visit on 9th, 11th or 12th September.
- JB: introduction to Nigel Topham at ARC (MILEPOST)
- JB: Tag original benchmark set on github for reference.
- JB: Add licences to all files in repository.
- SC: Define an attribute for a function <code>__attribute__((mageec_ignore))</code> and sort out warnings and errors.
- JP/SC: Link the energy precursor project on the MAGEEC website
- JP: Further flash experiments as detailed above.
- AB: To advise on licensing (e.g. CC) for data sets, and similarly for documentation.
- SH: Photo of hardware and software running for MAGEEC homepage.
- SH: publish hardware design files on github
- SC: MAGEEC mailing lists and any other secure domain to be set up.
- JP: Revision of energy paper.
- JP: Benchmarks paper first draft.
- MG: First ML paper, aiming for a first draft at the end of Sept. Afternoon of the 4th September meeting set aside for review of paper content.
- SC: Add new risk to risk register.
- SH&JB: Prepare finance for next meeting's planning session.
- All: Next meeting to review exploitation plan.
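One way the <code>__attribute__((mageec_ignore))</code> action could look from the benchmark side is sketched below: the self-verification helper is tagged so that a feature-extraction pass can skip it, while the kernel itself is left untagged. This is an assumption-laden illustration. The attribute does not exist until the plugin registers it, and GCC normally warns about attributes it does not recognise, so the sketch hides it behind a hypothetical <code>MAGEEC_PLUGIN_ACTIVE</code> guard macro. The functions <code>checksum</code> and <code>verify_checksum</code> are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical guard: expand to the attribute only when the build says the
// plugin is loaded, so plain builds stay free of unknown-attribute warnings.
#if defined(MAGEEC_PLUGIN_ACTIVE)
#define MAGEEC_IGNORE __attribute__((mageec_ignore))
#else
#define MAGEEC_IGNORE
#endif

// Benchmark kernel: features should be extracted from this as usual.
std::uint32_t checksum(const std::uint8_t *data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < length; ++i) {
        sum = sum * 31u + data[i];
    }
    return sum;
}

// Self-verification helper: kept lightweight and tagged so the pass can
// leave it out of the extracted features.
MAGEEC_IGNORE bool verify_checksum(std::uint32_t actual,
                                   std::uint32_t expected) {
    return actual == expected;
}
```

Keeping the tag in a single macro means the benchmarks still build with compilers that know nothing about MAGEEC.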