[MAGEEC] [BEEBS] Plackett-Burman Initial Review

James Pallister James.Pallister at bristol.ac.uk
Mon Aug 11 18:46:23 BST 2014


Hi George,

>
> I was aiming for a clustered bar chart, but instead settled on stacked
> columns, as there were far too many data points and the chart was
> cluttered. To explain the chart: I've plotted the percentage change
> between the average energy usage of each benchmark with each pass
> enabled and disabled. Thus, a negative value shows that the pass
> reduced the energy usage of a benchmark. In terms of the chart
> produced - bands below the x axis are where the benchmark had reduced
> energy,  whereas those above used more.
Looks good, although it is slightly difficult to interpret with so many
data points. All of the optimizations seem to have an effect on the
energy consumption, so we can't exclude any of them. Perhaps a
box-and-whisker plot for each pass would give a good idea of the
distribution of results.
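
As a rough sketch of the numbers behind both charts: the percentage
change is the difference between the enabled and disabled means,
relative to the disabled mean, and a box-and-whisker plot draws the
five-number summary of those changes for each pass. The function names
and all of the readings below are made up for illustration; they are
not taken from the BEEBS data.

```python
from statistics import mean, quantiles

def percent_change(enabled, disabled):
    """Percentage change in mean energy when a pass is enabled.

    A negative value means the pass reduced the energy usage.
    """
    base = mean(disabled)
    return 100.0 * (mean(enabled) - base) / base

def five_number_summary(values):
    """Min, Q1, median, Q3, max: the values a box-and-whisker plot draws."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical energy readings for one benchmark with one pass on/off.
enabled = [9.0, 9.5, 10.0]
disabled = [10.0, 10.0, 10.0]
delta = percent_change(enabled, disabled)

# Hypothetical delta % values for one pass across five benchmarks.
deltas = [-5.0, -1.0, 0.0, 2.0, 4.0]
summary = five_number_summary(deltas)
print(delta, summary)
```

One summary per pass, plotted side by side, would show the spread
across benchmarks rather than only the average.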

> One thing I noticed is that 2dfir had disproportionately large
> magnitudes in the energy changes. Therefore, I excluded 2dfir from the
> chart linked earlier. I will see if this changes after I've pulled in
> recent beebsv2 changes.
Looking at the raw data, are we certain this is correct? These
measurements seem really large; could there have been an anomalous
reading? It may be worth repeating the experiment for this benchmark
and seeing if you get the same results.

> I believe comparing the means of enabled vs disabled is the way to
> determine main effects. However, I'm not sure how to determine whether
> or not the difference is statistically significant - if you look at
> the raw data
> (https://github.com/ks07/beebs/blob/plb/plb/rudimentary_analysis.txt),
> a large portion of the energy changes are very small (for example,
> 1.953e-14%).
We should be able to take the raw data (i.e. the non-averaged data from
each run) and apply the Mann-Whitney U test to work out whether the
enabled and disabled distributions are significantly different or not.
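
A minimal sketch of that test, using only the standard library and the
normal approximation for the p-value (average ranks for ties, tie
correction in the variance, no continuity correction). The function
name and the sample readings are hypothetical. With only a handful of
runs per configuration the approximation is coarse, so an exact
implementation such as scipy.stats.mannwhitneyu would probably be a
better choice for the real analysis.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0  # every value identical: no evidence of a difference
    z = (u - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, min(p, 1.0)

# Hypothetical per-run energy readings with a pass enabled / disabled.
enabled = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
disabled = [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
u_stat, p_value = mann_whitney_u(enabled, disabled)
print(u_stat, p_value)
```

A small p-value would suggest the enabled and disabled readings come
from different distributions; tiny changes such as 1.953e-14% would be
expected to give a p-value near 1.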

From your raw data, here is a Hinton diagram:

Black indicates a decrease in energy and white an increase; the size of
each square is taken from the delta % column. The benchmarks are
horizontal, and the passes are vertical. I agree that we should exclude
the gdb-* tests. I've also excluded the 2dfir benchmark.
sglib-arraybinsearch also benefits a lot from the optimizations, so it
may be worth reinvestigating too.
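
For anyone wanting to redraw the diagram, here is a sketch of the
mapping from delta % values to squares. It assumes the usual Hinton
convention that the area of a square is proportional to the magnitude,
with the largest square given a side of 1.0; the function name, the
grid orientation and the values are hypothetical, and this is not
necessarily how the attached image was generated.

```python
import math

def hinton_cells(deltas):
    """Map a grid of delta % values to (colour, side) pairs.

    Black marks a decrease in energy, white an increase. The side is
    scaled so that area is proportional to |delta| and the largest
    square has side 1.0.
    """
    largest = max(abs(v) for row in deltas for v in row)
    cells = []
    for row in deltas:
        out = []
        for v in row:
            side = math.sqrt(abs(v) / largest) if largest else 0.0
            out.append(("black" if v < 0 else "white", side))
        cells.append(out)
    return cells

# Hypothetical grid: one inner list per pass, one value per benchmark.
grid = [[-4.0, 1.0], [0.25, -1.0]]
cells = hinton_cells(grid)
print(cells)
```

Each (colour, side) pair can then be handed to whatever plotting
library is convenient to draw a centred square in its grid cell.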

Interesting data :)

James

On 11/08/14 16:20, George Field wrote:
> Hi all,
>
> I've just finished doing a bit of analysis on a small subset of GCC
> passes for BEEBS. I still need to pull in some of the recent changes
> to BEEBS, namely deleting the benchmarks that are no longer part of
> the suite (the gdb-* benchmarks seem to be skewing the results
> somewhat) - but the results are still interesting.
>
> I've run 16 tests 3 times, testing the first 12 optional passes.
> Possibly the most interesting thing I've produced from the energy
> measurements is the following graph:
>
> https://raw.githubusercontent.com/ks07/beebs/plb/plb/main_effects_test.png
>
> I was aiming for a clustered bar chart, but instead settled on stacked
> columns, as there were far too many data points and the chart was
> cluttered. To explain the chart: I've plotted the percentage change
> between the average energy usage of each benchmark with each pass
> enabled and disabled. Thus, a negative value shows that the pass
> reduced the energy usage of a benchmark. In terms of the chart
> produced - bands below the x axis are where the benchmark had reduced
> energy,  whereas those above used more.
>
> You'll notice that no GCC pass was universally good or bad wrt the
> energy usage of our benchmarks. However, it's clear that the majority
> of passes have a tendency to either improve or impair the energy
> usage, on average.
>
> Another, more detailed look at the main effects shows the 3 best, and
> 3 worst passes for each benchmark.
>
> https://github.com/ks07/beebs/blob/plb/plb/best_passes.txt
>
> In this file, you'll see the name of the benchmark, followed by the
> best 3 passes and the percentage change on their energy usage.
> Following that are the 3 worst.
>
> One thing I noticed is that 2dfir had disproportionately large
> magnitudes in the energy changes. Therefore, I excluded 2dfir from the
> chart linked earlier. I will see if this changes after I've pulled in
> recent beebsv2 changes.
>
> I believe comparing the means of enabled vs disabled is the way to
> determine main effects. However, I'm not sure how to determine whether
> or not the difference is statistically significant - if you look at
> the raw data
> (https://github.com/ks07/beebs/blob/plb/plb/rudimentary_analysis.txt),
> a large portion of the energy changes are very small (for example,
> 1.953e-14%).
>
> Thanks,
> George

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ibejddgd.png
Type: image/png
Size: 121512 bytes
Desc: not available
URL: <http://mageec.org/pipermail/mageec/attachments/20140811/e33df7e5/attachment-0001.png>

