Automatic custom instruction generation for eMIPS (Migrated from community.research.microsoft.com)

    Question

  • Hi all

    I want to test a simple case with eMIPS, but it looks like it is going to be a long walk. Is it possible to perform the following experiment without too much complexity?

    This is what I want to do:

    a) compile a small self-contained (no I/O) C application, such as an iterative Fibonacci sequence generator

    b) have bbtools automatically identify custom instructions/instruction-set extensions

    c) use Giano (or Modelsim) -- either one but not both if possible -- to get a basic block profile

    d) use the basic block profile to sort out the most significant custom instructions in terms of cycle acceleration

    e) automatically reinsert them to the binary (bbpatch or bbrewrite?)

    f) re-evaluate with Giano, Modelsim, or whatever is more appropriate

    g) get a clean view of the cycles before and after, e.g. 10000 cycles before vs 4000 cycles after. This of course assumes that cycle time is not affected.

    I understand that M2V is the actual high-level synthesis tool that should be used for generating the custom instruction units for the augmented eMIPS, but a faster estimate with Giano (?) should be appropriate for a start. In case M2V is necessary, I could work with that too.

    I believe this is a direct application of the eMIPS framework. However, the test files, although numerous, are messy and get in the way of using the eMIPS distro properly.

    Best regards
    Nikolaos Kavvadias
    Adjunct Lecturer
    University of Peloponnese
    Greece

     

    REPLY:

    Hi Nikolaos,

     

    Welcome aboard :-)

     

    The short answer is “only with the current version”, which is not yet released :-(

    With the December distro you can do it, but *with* a lot of complexity.

     

    The steps are only a little bit different from what you listed, as follows.

    a) compile a small self-contained (no I/O) C application, such as an iterative Fibonacci sequence generator

       You can use the standalone mode for this. Use any of the provided tests for starters.

         Say your program is called “foo.c”, you will say

                   c_compile foo 

    b) have bbtools automatically identify the basic blocks (not yet the custom instructions/instruction-set extensions)

         In this step you actually identify just the basic block positions, for Giano’s use:

                   bbfind -T 80000000 -s foo.nm -b foo.bbs foo.rel

    c) use Giano (or Modelsim) -- either one but not both if possible -- to get a basic block profile

         Correct:

                   giano -Platform Ml40x_2ace.plx Giano::OverwriteBbsFiles 1 GPIO::ValueAtReset 4 SRAM::PermanentStorage foo.bin eMIPS::Implementation mips_bbt eMIPS::BbFile foo eMIPS::BbStart 2147483648 eMIPS::BbSize 40392

         The many parameters look pretty scary, but the only values that actually vary are the name of your binary (“foo”) and the size of the foo.bin file (“40392”).
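         The two varying values can be derived instead of typed by hand. The following is a minimal sketch in Python, which is not part of the eMIPS distro; the helper name `giano_args` is made up for illustration. It assumes that `eMIPS::BbStart` is the decimal form of the hexadecimal address passed to `bbfind -T` (0x80000000 = 2147483648) and that `eMIPS::BbSize` is the size of foo.bin in bytes, as stated in the reply. All other arguments are copied verbatim from the command line quoted in this step.

```python
import os

def giano_args(name, text_base_hex="80000000", platform="Ml40x_2ace.plx"):
    """Build the Giano argument list for the binary called `name`.

    Hypothetical helper: only the binary name and the size of
    <name>.bin change between runs; the rest mirrors the reply.
    """
    bb_start = int(text_base_hex, 16)          # 0x80000000 -> 2147483648
    bb_size = os.path.getsize(name + ".bin")   # size of the image in bytes
    return [
        "giano", "-Platform", platform,
        "Giano::OverwriteBbsFiles", "1",
        "GPIO::ValueAtReset", "4",
        "SRAM::PermanentStorage", name + ".bin",
        "eMIPS::Implementation", "mips_bbt",
        "eMIPS::BbFile", name,
        "eMIPS::BbStart", str(bb_start),
        "eMIPS::BbSize", str(bb_size),
    ]
```

         Calling `" ".join(giano_args("foo"))` in the directory that holds foo.bin yields a command line of the same shape as the one in this step, with the size filled in from the file on disk.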

    d) use the basic block profile to sort out the most significant custom instructions in terms of cycle acceleration

         With the December release you only have bbsort/bbdump:

                   bbsort -r - execycles foo.bbs foo.sort.bbs

                   bbdump foo.sort.bbs foo.nm > foo.sort.bbs.dump

             and the input to m2v is then created by hand, by editing the output of bbmatch:

                   bbmatch -c x foo.sort.bbs > foo.bbw

             Edit this file so that only the description of the block you want remains, and save it as, say, HB0.bbw.

             In the unreleased version there is a new tool that does the exploration automatically:

                   bbexplore foo.sort.bbs

             This writes the best candidate directly to a file called HB0.bbw, then the alternates to HB1, etc. This tool is the work of our last intern, Haris Javaid.

    e) automatically reinsert them to the binary (bbpatch or bbrewrite?)

        This is slightly longer, actually. First off, we need to produce the accelerator itself:

                   m2v -t HB0.bbw foo.v

                   <use the Xilinx tools to compile foo.v and produce the partial bit-stream foo.bit>

            There is now a simple command line script that will do that last step. We also have a number of required bugfixes for M2V.

            Next you will indeed patch your binary:

                   bbfind -T 80000000 -m HB0.bbw foo.rel foo.patched

    f) re-evaluate with Giano, Modelsim, or whatever is more appropriate

         This step is still missing from Giano, unfortunately. We have a plan for it, but we ran out of time with Haris before implementing it. The idea is to tell Giano (in a separate file) what the alternate timings are when executing the accelerated instructions, and where the transitions back to regular timings occur. The bbexplore tool would generate this file.
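         Until Giano supports alternate timings, a rough offline estimate can be computed from the block profile. The sketch below is in Python and is only an illustration of the idea, not something Giano or bbexplore does today. The function name `estimate_cycles`, the dictionary layouts, and every number are hypothetical; in particular, the dictionaries are not the .bbs file format. The numbers were picked to reproduce the 10000 vs 4000 cycles example from the question.

```python
def estimate_cycles(profile, accelerated=None):
    """Sum the cycles of all basic blocks in a profile.

    profile:     {block_name: (executions, cycles_per_execution)}
    accelerated: {block_name: cycles_per_execution_when_accelerated}
    Both layouts are hypothetical and chosen for this sketch only.
    """
    accelerated = accelerated or {}
    total = 0
    for block, (count, cycles) in profile.items():
        # Use the alternate timing when the block has an accelerator.
        total += count * accelerated.get(block, cycles)
    return total

# Made-up profile: one hot loop body plus fixed setup and teardown costs.
profile = {"loop_body": (1000, 9), "prologue": (1, 600), "epilogue": (1, 400)}

before = estimate_cycles(profile)                    # regular timings only
after = estimate_cycles(profile, {"loop_body": 3})   # loop body accelerated
print(before, after)  # prints "10000 4000"
```

         This ignores the cost of the transitions into and out of the accelerated instruction, so it gives an optimistic bound rather than a measurement.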

             What we actually do today is use the real implementation (e.g. the board). This requires creating the accelerator and including it in the executable image. For standalone this is semi-manual; look at how the mkall.bat script produces the tloop_ld.bin test for details. The program must enable the accelerator by itself (again, look at tloop_ld.c for details).

             For NetBSD there is a utility that does it, called ace2se:

                   ace2se -ph1 foo.accelerated foo.patched foo.bit 0 2000

             and it is the OS itself that does the loading/enabling etc.

    g) get a clean view of the cycles before and after, e.g. 10000 cycles before vs 4000 cycles after. This of course assumes that cycle time is not affected.

                   You can use the cycle counts from the timer; for example, look at how the mmldiv64 tests do it.

                   In the internal version we have added a MIPS-compatible on-chip cycle counter that is easier to use. On NetBSD, that is the only option in user mode for fine-grained timings.
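                   Once the two cycle counts are in hand, the comparison itself is simple arithmetic. The following is a minimal sketch in Python; the function name `speedup` is made up, and the inputs are the example figures from the question, not measurements. As the question notes, the result is only meaningful if the cycle time is the same in both runs.

```python
def speedup(cycles_before, cycles_after):
    """Return (speedup factor, fraction of cycles saved)."""
    if cycles_before <= 0 or cycles_after <= 0:
        raise ValueError("cycle counts must be positive")
    factor = cycles_before / cycles_after
    saved = (cycles_before - cycles_after) / cycles_before
    return factor, saved

# Example figures from the question: 10000 cycles before, 4000 after.
factor, saved = speedup(10000, 4000)
print(f"{factor:.2f}x faster, {saved:.0%} fewer cycles")
# prints "2.50x faster, 60% fewer cycles"
```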

     

    The current version is not yet released, first because of all the missing features, then because it needs testing before we give it out, and ultimately because we do not have enough time :-(

    If you are willing, I can provide you with what we have and give you further guidance offline.

     

    Thanks for your interest and for your patience,

    sandro-

     

    REPLY:

     

    Hi Sandro

    Thank you very much for your detailed and to-the-point answer!

    I will let you know how it went by posting my experience here and pasting the files (e.g. shell script, Makefile) for future reference.

    Best regards,

    Nikolaos Kavvadias

     

    Wednesday, June 22, 2011 18:28