[MAGEEC] [mageec-magicians] Re: Structure of the Results Database

Fri Aug 30 13:25:03 BST 2013

Hi Ashley,

How do you envisage the passes and flags_passes tables being used?
Usually one flag triggers a set of passes. Also, why is order needed in
both flags_passes and passes?
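
To make the question concrete, here is a rough sketch of one way a flag could map to an ordered set of passes. Only the table names passes and flags_passes and the idea of an order field come from the diagram discussion; the flags table layout, every column name (including "position" for the order), and the pass names pass_a and pass_b are assumptions made up for illustration. It uses an in-memory SQLite database purely for convenience.

```python
import sqlite3

# Hypothetical schema: column names are assumptions, not taken from the ERD.
SCHEMA = """
CREATE TABLE flags  (flag_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE passes (pass_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One flag can trigger several passes; the order is stored once, here.
CREATE TABLE flags_passes (
    flag_id  INTEGER NOT NULL REFERENCES flags(flag_id),
    pass_id  INTEGER NOT NULL REFERENCES passes(pass_id),
    position INTEGER NOT NULL,
    PRIMARY KEY (flag_id, pass_id)
);
"""


def passes_for_flag(conn, flag_name):
    """Return the names of the passes triggered by a flag, in stored order."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM flags f
        JOIN flags_passes fp ON fp.flag_id = f.flag_id
        JOIN passes p ON p.pass_id = fp.pass_id
        WHERE f.name = ?
        ORDER BY fp.position
        """,
        (flag_name,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO flags (flag_id, name) VALUES (1, '-fgcse')")
# pass_a and pass_b are placeholder names, not real compiler passes.
conn.executemany(
    "INSERT INTO passes (pass_id, name) VALUES (?, ?)",
    [(1, "pass_a"), (2, "pass_b")],
)
# The flag triggers pass_b first, then pass_a.
conn.executemany(
    "INSERT INTO flags_passes (flag_id, pass_id, position) VALUES (?, ?, ?)",
    [(1, 2, 0), (1, 1, 1)],
)

print(passes_for_flag(conn, "-fgcse"))
```

In this layout the order lives only in the linking table, which would be enough if ordering only matters within the set of passes a flag triggers. If passes also need a global order that is independent of any flag, a second order field on passes might be justified.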

Cheers,
James

On 30/08/13 12:14, Ashley Whetter wrote:
> Hello again everyone,
>
> Thanks for everyone's feedback so far. As per James' recommendation 
> I've removed the raw_results table and put results about a specific 
> run in the runs table, and summary data about a test in the test table.
>
> Also, after a discussion with Kerstin about how useful the database 
> structure would be for the machine learning, we realised that the 
> database wouldn't be storing any information about compiler passes.
> Therefore the diagram has been changed so that flag sets and feature 
> vectors are associated with a pass instead of a test. A test can also 
> have multiple passes.
>
> Here's the new diagram: 
> http://mageec.org/wiki/File:Results_ERD_Proposal.png
> I'll link to this page from now on so that it's easier to see 
> differences between versions of the diagram.
>
> Thanks,
> Ashley
>
>
> On 29 August 2013 10:50, Ashley Whetter <aw0455 at my.bristol.ac.uk>
> wrote:
>
>     Right you are. I've updated the diagram:
>     http://mageec.org/w/images/1/19/Results_ERD_Proposal.png
>     I've also renamed "time" to "timestamp" and "energy" to "power".
>
>     Ashley
>
>
>     On 29 August 2013 10:35, Munaaf Ghumran
>     <mg0950.2010 at my.bristol.ac.uk> wrote:
>
>         I agree with James that the flags_tests and run tables might
>         need test_id as shared foreign keys.
>
>         Other than that, nothing I can see that stands out, seems good!
>
>         Moon
>
>
>         On 28 August 2013 19:08, James Pallister
>         <James.Pallister at bristol.ac.uk> wrote:
>
>             Hi,
>
>
>>             A test (and test id) refers to a single combination of a
>>             platform, compiler, benchmark, and flag set.
>>             A run (and run id) refers to a single run of a test. A
>>             test can have multiple runs.
>             In the ERD, do the flags_tests and runs tables need test_id?
>
>
>>             The raw_results table is the bottleneck. We record
>>             200,000+ individual results for a single run, so this
>>             table will get really big quickly.
>             I'm guessing this is the power trace directly from the
>             measurement board? If so, the fields should be run_id,
>             timestamp and power.
>
>             We may not want to store the entire trace in the database
>             - with millions of measurements, the database might get
>             unmanageably large - might be better if the raw_results was
>             just the time, energy, average power, peak power, etc for
>             that specific run.
>
>
>>             We could split this out into a different results table,
>>             but it's a one-to-one relationship 
>             The join between tests and runs should be changed from
>             one-to-many in the diagram to one-to-one.
>
>
>             Looks good from the energy measurement side :)
>
>
>             James
>
>
>             On 28/08/13 18:24, Simon Hollis wrote:
>>             Hi Ashley,
>>
>>             Thanks very much for starting this discussion. This is a
>>             really good starting point.
>>
>>             What I would like is for everybody who has an
>>             interest in the structure of this database to provide
>>             their feedback on the structure that Ashley has proposed
>>             and say whether it will work for their anticipated needs.
>>
>>             As I see it there are at least three interests we need to
>>             support: Energy Measurement; the MAGEEC framework; ML.
>>
>>             Perhaps all sides could outline the suitability of the
>>             proposed structure for their needs?
>>
>>             P.S. for Magicians: If you received this message, but not
>>             Ashley's original one, it is because you are not yet
>>             signed up for the external mageec at mageec.org
>>             mailing list. Please do so!
>>
>>
>>             On 28/08/13 17:21, Ashley Whetter wrote:
>>>             Hey everyone,
>>>
>>>             At the last meeting we looked at the ER diagram
>>>             (http://mageec.org/wiki/Database) for the database that
>>>             would store the results that would be recorded by the
>>>             test framework and used by the plugin.
>>>
>>>             I've taken some of the comments made at the meeting and
>>>             made a more detailed ER diagram to discuss.
>>>             (http://mageec.org/w/images/1/19/Results_ERD_Proposal.png)
>>>
>>>             A test (and test id) refers to a single combination of a
>>>             platform, compiler, benchmark, and flag set.
>>>             A run (and run id) refers to a single run of a test. A
>>>             test can have multiple runs.
>>>
>>>             Currently the flag table is only really storing a flag
>>>             name (e.g. "-fgcse", "-fno-gcse", etc.). I've kept this as
>>>             a separate table, though, because eventually we'll want
>>>             to start storing values for flags that aren't just on or
>>>             off. We could add this value field now, keep the table
>>>             as is for now, or get rid of flag_id altogether and
>>>             just use the flag name in flags_tests instead.
>>>
>>>             I've put summary data in the runs table. We could split
>>>             this out into a different results table, but it's a
>>>             one-to-one relationship, so it would add an unnecessary
>>>             overhead of joining the runs and results tables when we
>>>             want to search it. This isn't such a problem if we
>>>             search for results by run_id though.
>>>
>>>             The raw_results table is the bottleneck. We record
>>>             200,000+ individual results for a single run, so this
>>>             table will get really big quickly.
>>>
>>>
>>>             Ashley
>>>
>>>
>>>             _______________________________________________
>>>             mageec mailing list
>>>             mageec at mageec.org
>>>             http://mageec.org/cgi-bin/mailman/listinfo/mageec
>>
>
>
