Picking up where I left off: Time Analysis



Date: 28.01.2017
Time Analysis:

To pick up from where I left off will require a few steps to get started. I’ll try to lay things out and make it as easy as possible. This project is about analyzing C programs as they are running on the ARM processor.

You will start with a C program (a benchmark) and run it through a program that compiles for and simulates the ARM architecture. This program is called uVision (micro vision). It will give you two things we need for our project: the time and the assembly code. I wasn’t able to figure out a good way to produce an assembly file from uVision, so I would copy and paste it from the debugger screen when in disassembly mode. I recommend looking at Zach’s video to get started with uVision.

Here are some other things that I found out about uVision. Originally we thought we needed to change xtal MHz from 12.0 to 60.0 (this is when you go into options for Target 1), but later this year we determined that it should stay at 12.0. Make sure you comment out all printf statements in the benchmark. When you are in the debugger mode and trying to get the assembly code, go to the disassembly view, right click and select Assembly Mode instead of Mixed Mode. One convenient thing in the debugger mode is that you can right click in the disassembly view and go to execution profiling, where you can enable either show time or show calls. You will find both of these to be useful. I have set up a lot of the uVision files for you. You will find these in a folder called uVision Projects in the zipped folder you should receive from Dr. Healy.

After you copy the assembly code from uVision, you should paste it in a notepad document and save it as a .s file. You will then need to run it through a Java program I wrote called converter. This will produce a .ss file that you can then run through RALPHO. The converter program simply changes the format of the data to what RALPHO is expecting. There was not too much focus on this small program, but there are a few quirks I implemented into it. You will find these in my journal.

Once you have the .ss file, it is time to run it through RALPHO. RALPHO is a Python program that is named ralpho_arm.py. To my knowledge, RALPHO should work well. The purpose of RALPHO is to turn the .ss file into a .inf file. RALPHO requires an ADF file. This stands for Assembly Definition File. The purpose of this file is to provide information specific to the ARM architecture. It contains information about opcodes and mnemonics. The .adf file we are using is called armv4a.adf.

Once you get your .inf file, it is time to move to the timing analyzer itself. The first thing you need to do is create a .ist file. You can do this by running the program inf2ist. Once you do this, you can run time.bin -iv [filepath] (add -v if you want verbose output, which I recommend when debugging). This will give you the timing analyzer’s output. I know it seems like a lot of steps to get here, but once you do it a few times, you’ll realize it’s not that bad. The hardest part is figuring out where the errors come from and what you should tweak.

To edit RALPHO I used a program called Aptana Studio, and to edit converter.java I used Eclipse. Other than that, I edited everything for the timing analyzer on Linux using emacs. I always ran things on Linux as well because I found it more convenient.

When you unzip my file, you will see a 2012 folder and a 2013 folder. 2012 is everything I was given. A lot of it is repeated in 2013, but most of it is updated in 2013. This is the layout for how I would run a file through the timing analyzer from start to finish.



  1. Start with the benchmark (we’ll say simpleArray.c)

  2. Create a uVision project (follow Zach’s instructions)

  3. Copy paste the assembly file into a text document (simpleArray.s)

  4. Using FileZilla or WinSCP, upload this to the folder /2013/ralpho/

  5. Run it through the converter (java converter simpleArray.s)

  6. Run the new .ss file through RALPHO (python2 ralpho_arm.py simpleArray.ss)

  7. Move the inf_file to the folder of inf files in time (mv simpleArray.inf ../time/inf_files/)

  8. Go to the time directory (cd ../time)

  9. Run the inf file through inf2ist (inf2ist inf_files/simpleArray) [do not use .inf at the end of the file]

  10. Run the program through the timing analyzer (time.bin -iv inf_files/simpleArray)

    1. If you want verbose output, (time.bin -iv -v inf_files/simpleArray > simpleArray)
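The steps above can be sketched as a small Python helper that only builds the command strings for steps 5 through 10, so they can be reviewed before being typed or scripted. This is a sketch under assumptions: pipeline_commands is a hypothetical name, the RALPHO line assumes ralpho_arm.py takes the .ss file as its only argument, and the benchmark name is passed without any extension.

```python
def pipeline_commands(benchmark):
    """Return the shell commands for steps 5-10 for one benchmark.

    `benchmark` is the base name without extension, e.g. "simpleArray".
    The commands are meant to be run from the ralpho directory.
    """
    return [
        # Step 5: convert the uVision listing into the format RALPHO expects.
        "java converter {0}.s".format(benchmark),
        # Step 6: run RALPHO (argument layout is an assumption).
        "python2 ralpho_arm.py {0}.ss".format(benchmark),
        # Step 7: move the resulting INF file next to the timing analyzer.
        "mv {0}.inf ../time/inf_files/".format(benchmark),
        # Step 8: switch to the timing analyzer directory.
        "cd ../time",
        # Step 9: note that inf2ist takes the path without the .inf suffix.
        "inf2ist inf_files/{0}".format(benchmark),
        # Step 10: run the timing analyzer.
        "time.bin -iv inf_files/{0}".format(benchmark),
    ]


if __name__ == "__main__":
    for command in pipeline_commands("simpleArray"):
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; wiring it to a shell is left to whoever picks the project up.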

My Journal:

May 28, 2013

Today was day one of my research. I began the morning by reading a detailed synopsis of what Zach did last year with the project. I then went to the library and checked out two books about programming in C. After getting my account up and running on a different server, I spent the afternoon exploring the different features of Linux and editing programs written in C. At one point, I could not yet access my new server so I was using the old one. Once I gained access, I successfully transferred all of my files from the old server to the new one. Another success I had today was downloading files from Dr. Healy’s website using a Linux command called wget. So far today has been about getting acquainted with my new workspace, becoming more familiar with a different operating system, and becoming more proficient in programming in C. Tomorrow Dr. Healy and I will further explore the actual project at hand.

May 29, 2013


Today started the research of the actual project! This morning I finished up reading about programming in C. I guess I’ve mastered the language in two days, but I’ll keep my reference books just in case. Jumping into the middle of this project has been kind of confusing, but I’m finally getting a handle on what’s going on. The basis of what I’m working on now is figuring out how Zach did a few things so that I can do them myself and also to pick up from where he left off.

The first task of today was figuring out how to create assembly files (.s) from a C program. I think that if I use a program on Zach’s computer (uVision), it gives me the information which I can then copy and paste into an editor to be an assembly file. It would still be nice if the program created one for me. I’ll keep playing with it!

I figured out what an ADF file is, and though we found many in the zipped file from last year, we decided to use armv49.adf for now. This file has necessary information about the hardware for RALPHO to run. I should also make sure this is a complete and correct file. My next task is to test what I’ve done so far. I need to write some simple C programs and run them through the compiler to get an assembly file and then run that through RALPHO. Through time, I should be able to troubleshoot the program’s bugs. However, before this is possible, I need to thoroughly understand what the output of RALPHO is telling me. It seems possible that a lot of the code at the beginning of the output is just startup jargon, stuff that isn’t important to my programs. It also seems as though there might be the option in RALPHO to make .xml and .inf files, maybe both? While I’m debugging RALPHO, I need to make sure it is complete and see what all the features are that it currently has. Hopefully in about a week RALPHO will be able to identify the following in a C program: loops, number of loop iterations, blocks, instructions, etc. Once I finish with RALPHO and understand the output files, I will get to move on to the time analyzer, which will input these files.

May 30, 2013

Today I made some really great progress. I feel like I might be pretty close to where they left off last year! I started out this morning by looking up and confirming that it’s possible to put uVision onto my computer. However, right now we don’t need that! We were compiling through this program to get the assembly code; however, we figured out today that we can compile on the Raspberry Pi machines and it will produce an assembly document for us. This is much easier than what we were doing! Dr. Healy also took a good amount of time to explain assembly language to me today. With this advance, I got to move on and start using RALPHO!

I started out by writing some very simple programs that just used one loop and an array. With these programs, I was able to understand the assembly code that I produced and I could follow RALPHO’s output. RALPHO seems to be doing well. I did find the error that Zach admitted was there. When there is a function that starts with ‘r,’ RALPHO thinks this is a register and disregards it as a function. This is kind of problematic. Tomorrow I think we’re going to focus more on how to solve this problem. There is also one other detail we are going to focus on tomorrow which is looking at the macros to see if it correctly handles these or not. It does produce some sort of error message when running RALPHO but it also produces an INF file. We will see what’s going on with that tomorrow!

May 31, 2013

Today I started debugging. This is always tough. I don’t feel like I accomplished as much today, but that’s because we were looking for the wrong answers. I spent a large portion of the morning looking for opcodes that we have come to the conclusion don’t exist. As of now, we don’t think we will be able to analyze C programs that use floating point. We need to stick with integers. The compiler on the simulator uses a function for floating point calculations. We might have to implement this in the future. I have also spent the day trying to figure out why some of the instructions in the assembly file don’t transfer over into the inf file that RALPHO produces. There’s still something kind of funny about what is happening with RALPHO printing invalid instructions all the time. Next week will be spent further debugging RALPHO and determining which bugs really do exist and which ones need to be fixed.

June 3, 2013

Today I finally got all of the needed programs on my computer. Zach helped me move uVision from his computer to my computer, and I also downloaded Aptana Studio, which is supposed to help me with debugging Python programs. After figuring out that using gcc -S was not producing the assembly code we wanted, we decided we needed to figure out a way to use the assembly code from uVision. Today I wrote a small program called converter that reads the uVision assembly code and outputs a file formatted similarly to what gcc produces.

June 4, 2013

This morning I finished up the converter program I started writing yesterday. I spent a large portion of today examining RALPHO and trying to understand why some of the instructions were omitted in the INF file produced. I found that uVision formatted one of the instructions differently and I am now trying to figure out how to accommodate this. When uVision produces the assembly code for an add instruction, it uses four different operands instead of three. We had to go into the ADF file and add commas to the operands beside the opcodes to accommodate this; however, that of course caused some new issues. One specific problem was encountered near the end of RALPHO, where we had a tokenizer. Adding the extra comma produced a new token, but the tokenizer wasn’t acting accordingly. We solved this by counting from the right of the array instead of from the left. This was a cool thing I didn’t know you can do in Python. I think there are still a few more issues with RALPHO but I will focus on those tomorrow.
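Counting from the right can be illustrated with Python's negative indexing. This is only a sketch, not the RALPHO tokenizer: the line layout (mnemonic, comma-separated operands, then a trailing opcode field) and the value 080 are made up for the example.

```python
def last_field(line):
    """Return the final whitespace/comma separated token of a line."""
    tokens = line.replace(",", " ").split()
    # tokens[-1] is the last element no matter how many operands precede it,
    # whereas a fixed index from the left shifts when an operand is added.
    return tokens[-1]


three_operands = "add r,r,r 080"        # hypothetical layout
four_operands = "add r,r,r,LSL#0 080"   # same line with the extra operand

print(last_field(three_operands))
print(last_field(four_operands))
```

With indexing from the left, the trailing field would be token 4 in the first line and token 5 in the second; indexing from the right gives the same position for both.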

June 5, 2013

Today was a productive day. I got several programs to run through RALPHO. As of now, I’m pretty confident in MOST of RALPHO. There are still a couple of errors to be fixed, but hopefully they are small. Today’s big progress was fixing yesterday’s issue. In more detail, here is what happened. When uVision compiles a program, some instructions are printed differently than when gcc compiles the same program. We sort of knew this and that’s why I wrote the converter program, but today we found another specific case. I’m going to talk about the instruction ‘add’, though this happened with other instructions as well. With the add instruction, the ADF file was expecting 3 registers followed by a shift operation explaining what to do (for example, LSL). The first thing I had to solve was adding commas in the ADF file so that it would tokenize all operands (for example: r,r,r,LSL#0). After doing that, I had to rewrite a small portion of the code in RALPHO to read in all of the tokens, not just a set amount. There were still some instructions that didn’t get recognized, so after examining the hexadecimal code we came to this conclusion: if the hexadecimal digit is a 0 or a 1, and there are two or three registers given, we must have a shift amount for the instruction. If one is not given, then we default to a left shift of 0, so we need to add LSL#0 to the operands. This is accomplished in the Java program converter. There’s one exception that I’ve come across so far, and that is for the compare. Though the compiler produces the mnemonic cmp, by examining the hexadecimal digits that accompany the mnemonic, the compiler actually uses the opcode that corresponds to the mnemonic “cmps”, and cmps only wants two registers. This makes sense because you would only be comparing two things.
The issues that I know still exist in RALPHO are, first, that it wants the main method to be first, and second, that we still have the r mystery (it doesn’t pay attention to functions that start with the letter “r”).
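The LSL#0 defaulting rule described above can be sketched as follows. The real logic lives in the Java converter; this Python version is only an illustration under assumptions. The function name add_default_shift is hypothetical, the caller is assumed to have already decided from the hexadecimal digit (0 or 1) whether a shift is required, and only registers written as r0 to r15 are recognized.

```python
import re

# Simplification: only rN register names are treated as registers here.
REGISTER = re.compile(r"^r\d{1,2}$", re.IGNORECASE)


def add_default_shift(mnemonic, operands, needs_shift):
    """Append LSL#0 when a shift is required but none was printed.

    needs_shift: True when the hexadecimal digit examined by the caller
    is 0 or 1 (which digit that is depends on the listing format).
    """
    operands = list(operands)
    if mnemonic.lower() == "cmp":
        # Exception from the journal: the compare only wants two registers.
        return operands
    only_registers = all(REGISTER.match(op) for op in operands)
    if needs_shift and only_registers and len(operands) in (2, 3):
        operands.append("LSL#0")
    return operands


print(add_default_shift("add", ["r0", "r1", "r2"], True))
print(add_default_shift("cmp", ["r0", "r1"], True))
```

An operand list that already ends in a shift such as LSL#2 is left alone, because that last operand is not a register.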

June 6, 2013

I think I’m getting closer and closer to being finished with RALPHO. But, I’m certainly not there yet. Today I figured out a nice little bug that I need to fix. There are condition codes that RALPHO tries to take out of the instruction codes that it reads. Sometimes, however, there are also instructions that have these two letters included, and when RALPHO takes out what it thinks is a condition code, it affects the actual mnemonic of a real instruction. I was hoping that I could just comment this code out, but I did run into a file that used the condition codes so I have to go back and add some conditional statements to the code. I also made some progress on the missing block problem. I rewrote some code to correctly add function names into an array trying to store the function names. The problem was it was trying to store what it thought was a function name but it also included the memory address. So, when it would compare a label name to the function names, it wouldn’t be an exact match. (Ex: Function name = is_symmetric(0x0000024C) label name= is_symmetric new_function_name=is_symmetric) This should allow ‘main’ to be anywhere in the program. However, bubblesort is not showing all function names right now and Fresnel2 is missing some blocks. So I will have to look into that mystery tomorrow. I also got some errors from uVision when I tried to build the targets, so I will have to look into those tomorrow as well (L6406E and L6407E). Dr. Healy won’t be here tomorrow, but I have a lot of little things to work on. Hopefully when he returns I’ll have some good news and positive things to show him!
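Both fixes from this entry can be sketched in a few lines of Python. This is not the RALPHO code: strip_address and split_condition are hypothetical names, and the set of known mnemonics is a tiny stand-in for what would really come from the ADF file. The idea shown for condition codes (only strip a two-letter suffix when the full mnemonic is not itself a known instruction) is one possible approach, not necessarily the one RALPHO ended up with.

```python
# The standard two-letter ARM condition codes.
CONDITION_CODES = {
    "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
    "hi", "ls", "ge", "lt", "gt", "le", "al",
}


def strip_address(name):
    """Turn 'is_symmetric(0x0000024C)' into 'is_symmetric'."""
    return name.split("(", 1)[0].strip()


def split_condition(mnemonic, known_mnemonics):
    """Split a mnemonic into (base, condition) without damaging real names."""
    m = mnemonic.lower()
    if m in known_mnemonics:
        # e.g. 'teq' ends in 'eq' but is an instruction in its own right.
        return m, ""
    if len(m) > 2 and m[-2:] in CONDITION_CODES and m[:-2] in known_mnemonics:
        return m[:-2], m[-2:]
    return m, ""


known = {"b", "bl", "mov", "teq"}  # hypothetical subset for illustration
print(strip_address("is_symmetric(0x0000024C)"))
print(split_condition("movne", known))
print(split_condition("teq", known))
```

Mnemonics that appear to carry three letters of condition codes, which the following entry mentions, are not handled by this sketch.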

June 7, 2013

I was successful at resolving two of the RALPHO issues today! Though I made an attempt at a third, there’s going to have to be more work done on that one. Here’s what I did accomplish. When I was running the benchmark bubblesort through RALPHO, it was not acknowledging that bubblesort was a function. The reason is that when it read in the line “bl BubbleSort(0x0000028C)”, the code replaced all instances of the instruction (bl) with an empty string and then removed white space. When this was done, the resulting string was “BubeSort(0x0000028C)”. So my fix for this was to have it split into a list at the space. BubbleSort is now recognized as a function. I don’t know how many times we would have seen this, but it should be fixed. The other big accomplishment today was that I fixed the “r” problem! The problem was that it was looking at the operands and assuming that if the first letter was an ‘r’ then it was a register and not a label. So, to solve this, I put a conditional in there that takes the length of the operands after splitting them at every comma. If the length is only 1, then it must be a label or a bx statement with a register value. This seems to have solved that problem. The problem that still remains goes back to the condition codes. I went through the ADF file and found all instruction codes that would be affected if a condition code was taken out of them, and told RALPHO not to take letters out of those. However, there are a few instances where an instruction name appears to have 3 letters of condition codes. I will have to talk to Dr. Healy about how to solve this on Monday. The other problem that is left is with Fresnel2. I didn’t get a chance to look at this today, but I know the INF file does not start with block one. I will investigate that problem more on Monday as well. I looked up the errors today that occur in uVision.
I’m not exactly sure what they mean, but I know they have something to do with running out of memory. All in all, this has been a very productive week. I predict we will be finished debugging RALPHO very soon.
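The BubbleSort bug and the two fixes are easy to reproduce in Python. The snippet below is an illustration only, with hypothetical function names; it is not taken from ralpho_arm.py.

```python
line = "bl BubbleSort(0x0000028C)"

# The original approach: remove every "bl", then trim white space.
# "Bubble" itself contains "bl", so the function name is damaged too.
broken = line.replace("bl", "").strip()
print(broken)  # BubeSort(0x0000028C)


def call_target(line):
    """Split once at the first run of white space and keep the operand."""
    parts = line.split(None, 1)
    return parts[1].strip() if len(parts) == 2 else ""


def is_single_operand(operand_text):
    """True when there is one operand: a label, or a register for bx."""
    return len(operand_text.split(",")) == 1


print(call_target(line))               # BubbleSort(0x0000028C)
print(is_single_operand("rotate"))     # True: a label that starts with r
print(is_single_operand("r0,r1,r2"))   # False: a register list
```

Splitting at the space touches only the separator between mnemonic and operand, so nothing inside the function name can be altered.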

June 10, 2013

I didn’t accomplish as much today as I was hoping. This morning was spent trying to figure out why fresnel2 didn’t work correctly on RALPHO. The reason was there were no function calls because the program was incomplete. Dr. Healy and I went to the library to look up the program and complete it. I spent the afternoon trying to figure out how to view output through the ULINK2 and the ARM processor. I have everything connected, and can do the example programs provided, but haven’t been able to figure out how to run my own code through it yet.

June 11, 2013

This morning I finally figured out how to display output using uVision. You must include some code in the source file and also add two other C files to the project. I then ran every benchmark program through uVision for the purpose of getting the assembly code and running that through RALPHO. This yielded some pretty good results. There are still a few programs that use coprocessor codes and some that have the condition code extraction problem, but most of the programs ran through successfully. The next step is to modify RALPHO so that it counts the number of loop iterations. To do this, I am going to have to detect a loop and, within that loop, detect the initial value, the limit, and the increment amount. After this, RALPHO should be good to go!
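Once the initial value, the limit, and the increment are known, the iteration count follows from a ceiling division. The sketch below shows that arithmetic in Python; it is not RALPHO's implementation. It assumes a counted loop whose test happens before each iteration (a for-style loop), with constant bounds and a constant nonzero step.

```python
def loop_iterations(initial, limit, step, inclusive=False):
    """Iterations of: for (i = initial; i < limit; i += step).

    Use inclusive=True for <= (or >= when counting down).
    A negative step models a loop that counts down toward limit.
    """
    if step == 0:
        raise ValueError("step must be nonzero")
    if inclusive:
        limit += 1 if step > 0 else -1
    span = limit - initial
    # No iterations when the loop is already past its limit, or when the
    # step moves away from the limit.
    if span == 0 or (span > 0) != (step > 0):
        return 0
    # Ceiling division on positive integers.
    return -(-abs(span) // abs(step))


print(loop_iterations(0, 10, 1))    # 10
print(loop_iterations(0, 10, 3))    # 4  (i = 0, 3, 6, 9)
print(loop_iterations(10, 0, -2))   # 5  (i = 10, 8, 6, 4, 2)
```

A loop whose test sits at the bottom (do-while) runs at least once, so it would need a separate case.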

June 14, 2013

This week was a very productive week. The highlight of the week was adding the loop iterations counter to RALPHO. As of now, RALPHO can detect the number of loop iterations for any loop, and if it is a nested loop it can detect that and decipher the information. I need to refine a few things next week. One example is in the benchmark program neville. When RALPHO is searching for the increment value, it encounters instructions in the body of the loop that modify the increment variable in different ways, not just the control variable increment. I will have to take a closer look at this. Also, I have future work for this feature: be able to detect the limit when it is a variable plus a constant, and be able to detect the limit and initial value when it is a variable multiplied or divided by a constant. The other thing I will have to continue to refine in RALPHO is the INF output. I need to modify it a little more to match the INF input the time analyzer wants. This should not be very hard. The output has to be different whenever there is a nested loop versus when there is just a standalone loop or an outer loop.

June 18, 2013

I formatted the output of RALPHO’s INF file so that it included all of the loop information. We have moved on to working on the time analyzer. Right now the problem we are having is there is an infinite loop in path.c when evaluating the path of the loop nodes. It has been very tricky to detect the problem with this because the algorithm should be correct and it is probably something with the INF file.

June 25, 2013

Today marked a day of a lot of progress. Today I was able to successfully run benchmarks through the time analyzer! The problem that I faced earlier in the time analyzer was solved by taking out what I can only guess was a “quick fix” from last year. The time analyzer was expecting its input to have the mnemonic at the end of the instruction line in the INF file, but RALPHO doesn’t do that. Once I stopped the analyzer from scanning for this extra token, the infinite loop went away. I then had to go back and edit RALPHO a little bit more to ensure that the loop detail I added before was correct, and to put the loops in a different order. Previously, RALPHO was listing the loops in sort of an inside-out way. If there were nested loops, the innermost loop would be 1. I fixed this so that the outermost loop is 1. There wasn’t much testing done for these changes and I’m sure there are still some nested loops that will confuse RALPHO. I need to go back and thoroughly test this and try to fix some of the errors that occur. I also had to edit the ADF file again. This was similar to what I did the first time. Basically, uVision created more tokens than were in the ADF file, so I needed to add a comma to fix this. Today I was able to run 10 programs through the time analyzer and get results for them. There were five programs that failed to go through the timing analyzer and seven that failed to go through RALPHO. One of the next steps will be to see how precise the timing analyzer really is, as well as to further troubleshoot these errors.

July 1, 2013

Today marked the first time I was able to successfully run a program that had more than one function through the timing analyzer. I had to make a few small changes in the timing analyzer to detect function calls because of the addition of the fourth operand in ARM. I also discovered two big changes I need to make to RALPHO. Currently, RALPHO does not print the name of a function when printing the instructions and the instruction is a function call. The other change is that RALPHO numbers the blocks starting with 1 and counts up for every block in the program, but it should start over with block one every time there is a new function. Today I manually made these changes to the INF file and was able to run it through the timing analyzer. I also implemented the function iscall in the isa file. This required setting a variable UCALLI to “a00”, which is the opcode for function calls.
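The block renumbering described above (restart at 1 for every function) can be sketched like this. The input shape, a list of (function name, program-wide block number) pairs in program order, is an assumption made for the example and not the INF format itself.

```python
def renumber_blocks(blocks):
    """Replace program-wide block numbers with per-function numbers.

    blocks: list of (function_name, global_block_number) in program order.
    Returns a list of (function_name, local_block_number).
    """
    next_number = {}
    renumbered = []
    for function_name, _global_number in blocks:
        # Each function keeps its own counter that starts at 1.
        next_number[function_name] = next_number.get(function_name, 0) + 1
        renumbered.append((function_name, next_number[function_name]))
    return renumbered


program = [("main", 1), ("main", 2), ("sum", 3), ("sum", 4), ("sum", 5)]
print(renumber_blocks(program))
```

Any references between blocks (for example branch targets stored as block numbers) would have to be remapped with the same table; the sketch leaves that out.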

Today I found the function get_inst_cycles. This function is not currently implemented, but, I’m thinking if implemented, it may account for some of the errors we are observing when running the timing analyzer. The reason this function doesn’t work yet is because the function determine_inst_name was not implemented last year due to time constraints and may be a little trickier to do this year. This function takes the opcode and looks up the mnemonic. The reason this seems like it may be kind of tricky is because of the way inst_list is read from instset.arm7tdmi.

Lastly, I still need to come up with a way to write the function is_branch and I know I have a lot of work still to do involving the pipeline analysis.

July 3, 2013

Between yesterday and today, I have been working primarily on four things to get ready to work on pipeline analysis, but this has led to working on many more things. The first thing accomplished was creating another variable that converted the opcode read in from the INF file from a string to an integer. This was easy! The next thing was making sure the instruction list was read in correctly. I wasn’t surprised to find that it wasn’t actually being read in at all. I updated this function so that it reads everything in from instset.arm7tdmi and accounts for things such as the * representing 16 different possible opcodes for the same mnemonic. This function now works, but I need to modify it. I’ll get back to that in a second. The purpose of reading in these opcodes and mnemonics is so that I can use the function Determine_Inst_Name, which is passed an opcode read in from the INF file and finds the respective name. This would then be used by the function get_inst_cycles to determine how many cycles each instruction would take. The current get_inst_cycles is not exactly what we want and will have to be rewritten, but I already knew this. The pipeline analysis will replace this function. In the process of doing all of this, I found some inconsistencies with opcodes. I had to make two major changes. The first really big change was changing all mnemonic instruction names outputted from uVision as “cmp” to “cmps.” We only have pseudo opcodes for “cmp”, and when you look at the opcode given by uVision, it is the same as “cmps” even though it uses the mnemonic “cmp.” This change occurs in the converter program. Another inconvenient change I had to make was done in RALPHO. It also turned out that we only had pseudo opcodes for conditional branch instructions, so in RALPHO, when looking up the opcode of a conditional branch, it temporarily treats it as if it is an unconditional branch statement.
This will affect reading the output of the timing analyzer; however, it doesn’t affect the timing analysis, and if you look at the opcodes that uVision provides for conditional branch statements, they are the same as those that are used with unconditional branch statements. Future work will involve passing the correct mnemonic to the timing analyzer, but for now we are going to move on.
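The wildcard expansion and the opcode lookup can be sketched in Python as below. The real code is C inside the timing analyzer and reads instset.arm7tdmi, whose exact layout is not reproduced here. The assumptions: opcodes are hexadecimal strings, a * stands for any one hexadecimal digit, and the entries and mnemonics in the example (op_a, op_b) are invented placeholders.

```python
HEX_DIGITS = "0123456789abcdef"


def expand_opcode(pattern):
    """Expand each '*' in a hex opcode pattern into all 16 digits."""
    pattern = pattern.lower()
    if "*" not in pattern:
        return [pattern]
    expanded = []
    for digit in HEX_DIGITS:
        # Replace only the first '*' and recurse for any that remain.
        expanded.extend(expand_opcode(pattern.replace("*", digit, 1)))
    return expanded


def build_lookup(entries):
    """Map integer opcodes to mnemonics from (pattern, mnemonic) pairs."""
    table = {}
    for pattern, mnemonic in entries:
        for opcode in expand_opcode(pattern):
            # The INF file gives the opcode as text; store it as an integer.
            table[int(opcode, 16)] = mnemonic
    return table


# Placeholder entries, not taken from instset.arm7tdmi.
lookup = build_lookup([("a00", "op_a"), ("08*", "op_b")])
print(len(lookup))               # 17 entries: 1 + 16
print(lookup[int("08f", 16)])    # op_b
```

A dictionary like this is also a natural place to attach the per-instruction cycle counts, so that one lookup by opcode returns both the name and the timing information.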

For now, I have a lot of future work that needs to be done within the next week or two. I can only get the timing analyzer to run on a few programs that don’t have more than one function, and though it is more accurate than before, it is not correct. I need to do more investigating to figure out why the time analyzer doesn’t run on a program with more than one function. For RALPHO, I have 2 big things that need to be changed. I need to make sure that when there is a function call, it prints the name of the function at the end of the instruction line. I also need to renumber the blocks starting with 1 every time there is a new function. In the timing analyzer, I need to rewrite the function I just wrote that reads in all of the instructions and opcodes. Instead of using the file instset.arm7tdmi, I need to use worst_stages_arm7tdmi. This has all of the same information as instset.arm7tdmi, but it also includes the cycle information I need for the pipeline analysis. These cycle times need to be in the “dictionary” part of the program so they can be looked up and attached to each instruction. I am currently planning on attaching these numbers to each instruction at the same time that the instruction looks up the mnemonic by passing in its opcode. Then, I will be ready to start implementing the pipeline analysis.

July 15, 2013

While Dr. Healy has been away, I have accomplished a lot. With respect to RALPHO, I have added the two features I needed to add: Starting the block count at 1 for each new function and adding the name of the function called to the end of the instruction line if the instruction is a function call. One thing Dr. Healy would like me to implement still is having RALPHO detect the number of dynamic instructions for each block.

My big accomplishment was in time.c. I have added the pipeline analysis! This was very tricky and involved a lot of code. I would still like to look over the code and see if there is a less verbose way to accomplish what I have added. I also need to check to make sure the code works for all cases. Since there are still a few issues with RALPHO outputting the correct INF file for the timing analyzer, this is tricky. I think I’m going to spend the remainder of my time trying to clean up RALPHO and timing analyzer errors.

I have also been exploring the functionality of uVision. I have found a way to show the timing of each instruction and timing of each function. There is also an option that shows how many times each function and each instruction is called. I think this will be very useful as we proceed with checking the timing analysis.

I solved another issue that has been occurring recently with the timing analyzer. I wasn’t able to run programs through it with multiple functions. The problem was the name of the function contained the underscore, so when the function call was being compared to the function names, it wasn’t matching correctly. The name should no longer include this underscore when it is being compared.

July 19, 2013

Since Dr. Healy has returned, we have made a lot of progress. As of now there are a few programs that I can run through the timing analyzer and get correct results. There were a few things I needed to change in the timing analyzer.

We determined that we needed to add a break statement in the function Time_Worst_Case for the inner do-while loop. We added this because the comment says we break out of the do-while if we have no more fm or fh or we run out of iterations. This has to do with caching. Since we aren’t dealing with caching right now, we needed to add a break statement. This also changes the way the timing analyzer calculates the number of cycles. Before we added the break statement, it was looking at each individual loop iteration instead of multiplying the worst case by the number of loop iterations. We added an else statement to accumulate the total time by multiplication. This makes our code more efficient.

One thing that is different about the ARM compiler is the way assembly code is produced for loops. When there is a loop, there are usually two different paths now: continue and exit. Before, it was not as common for the exit path to be different from the continue path, that is, there was one path and it was both continue and exit. Now, however, the ARM compiler produces a specific exit path and specific continue path. Because we have this exit path, I had to add a little bit of code in Time_Worst_Case to look at the exit path and add that to the total time. It was essentially replicating what we did for the continue path, but for the exit path.
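The change from per-iteration accumulation to multiplication, plus the separate exit path, can be illustrated with a small Python sketch. This is not Time_Worst_Case: the function names are hypothetical, and the accounting (the continue path counted once per iteration and the exit path counted once) is an assumption that may differ from what the timing analyzer actually does.

```python
def loop_time_iterative(continue_cycles, exit_cycles, iterations):
    """Accumulate the worst case one iteration at a time."""
    total = 0
    for _ in range(iterations):
        total += continue_cycles
    return total + exit_cycles


def loop_time_multiplied(continue_cycles, exit_cycles, iterations):
    """Same total, computed in one step.

    Valid while every iteration has the same worst case, which holds
    when no cache state changes between iterations.
    """
    return continue_cycles * iterations + exit_cycles


print(loop_time_iterative(12, 5, 100))
print(loop_time_multiplied(12, 5, 100))
```

Once caching is modeled, early iterations may cost more than later ones, and the per-iteration form (or a first-iteration special case) becomes necessary again.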

I made some updates to the branch penalty and I think I have a better idea of how to determine if a branch was taken or not. This is kind of tricky because of the issue we had earlier with everything going to the timing analyzer as an unconditional branch statement. However, basically I look at the next instruction in the path (sometimes it is in the next block) and if the instruction number is one more than the current instruction, you know the branch wasn’t taken. The tricky part is dealing with branches at the end of paths.
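The taken/not-taken test described above can be sketched in Python. The data shapes are assumptions for illustration: a path is a list of instruction numbers in execution order, and branches is the set of instruction numbers that are branch instructions.

```python
def classify_branches(path, branches):
    """Decide for each branch on a path whether it was taken.

    Returns {instruction_number: True | False | None}; None marks a
    branch at the end of the path, where the path alone cannot tell.
    """
    result = {}
    for position, instruction in enumerate(path):
        if instruction not in branches:
            continue
        if position + 1 == len(path):
            result[instruction] = None
        else:
            # Falling through to the very next instruction number means
            # the branch was not taken.
            result[instruction] = path[position + 1] != instruction + 1
    return result


print(classify_branches([1, 2, 3, 7, 8, 9], {3, 9}))
print(classify_branches([1, 2, 3, 4], {2}))
```

The None case is the tricky part mentioned above: resolving it needs information from outside the path, such as the successor blocks.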

The big thing that Dr. Healy and I discovered yesterday has to do with pipeline analysis. I was working on getting the answer more correct for a small program that I wrote called fun. This program has a loop that calls a function that returns 5. The way we were first doing the pipeline analysis wasn’t getting the same answer as our simulator. After analyzing the problem for a bit, we determined the way we were doing the branch penalty and dealing with function calls wasn’t correct. Instead of the branch penalty being 2 additional cycles that were added on to the total time, we determined that we need to add the branch penalty into the pipeline analysis. By adding the branch penalty into the pipeline analysis, we determined that we did not have to wait on an instruction as our previous analysis said. This saved a cycle, and made our answer more accurate.

July 26, 2013

When trying to get the answer for sum10.c to be correct, we ran into a problem. Our pipeline analysis was showing that adding another instruction would not change the time; however, uVision’s simulation showed that the extra instruction still had execution time. It was for this reason that I focused my efforts on downloading and installing Gem5 to get another perspective. I wasn’t able to get complete results from Gem5; however, I have a separate document explaining everything that I did find from Gem5.

July 30, 2013

During my final week of research, my main objective is to get as many results as possible. Right now, I am identifying the programs that won’t run through the timing analyzer, RALPHO, or uVision. I spent a little time trying to get Bubblesort to run through the timing analyzer, but decided it would be a better use of my time to try to get more programs through RALPHO.

I was working on getting expint to run through RALPHO but was having difficulty because RALPHO was detecting more loops than there actually are in the program. This is because uVision created what we determined to be unreachable code. At this point, we have decided to move on from this program. Tomorrow I am going to explore more programs that don’t run through RALPHO.

August 1, 2013

I have pretty much finished everything I am going to be able to accomplish this summer. At this point I am trying to make all of the data easy to read and understand, to help the next researcher. I have created an Excel spreadsheet that shows the results of all the benchmarks we have used so far, with as much detail about the results as we know. I am going to refine my journal a little to avoid contradicting myself and to make it more straightforward. Lastly, I am going to look for a few last-minute benchmarks from some sources Dr. Healy learned about at his conference.

I have created a folder called extra benchmarks. These are benchmarks that I found but didn’t have time to run through everything. I didn’t include any floating point programs because I’m not sure how well we handle those; however, I did include programs that use double. I didn’t get to test double vs. int vs. float, so I’m not sure how well those will work either. The benchmarks are in subfolders that indicate which source I got them from.

I thought I should leave you with a word of advice. The timing analyzer contains folders that have the same name but different capitalization. This is OK on Linux, but not on Windows. If you transfer these files to a Windows machine, you will quickly be frustrated when files overwrite each other because of the capitalization.
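One way to find the problem folders before copying anything to Windows is to scan the tree on Linux for names that differ only by capitalization. The following Python sketch is my own suggestion, not part of the timing analyzer, and the function names are made up.

```python
import os
from collections import defaultdict


def case_collisions(names):
    """Group names that are identical once capitalization is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def find_case_collisions(root):
    """Walk root and report entries in the same directory that would collide
    on a case-insensitive file system such as the Windows default."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for group in case_collisions(dirnames + filenames):
            found.append([os.path.join(dirpath, name) for name in group])
    return found


if __name__ == "__main__":
    for group in find_case_collisions("."):
        print(" <-> ".join(group))
```

Run it from the top of the timing analyzer directory and rename one entry in each reported group before transferring.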

I would also like to note that before I started, I had no experience programming in C. I used C Programming: A Modern Approach, Second Edition, by K. N. King, from the Furman Library.

FINDINGS:


  • The time cannot be calculated by looking at how long a singular instruction takes and looking at how many dynamic instructions there are. Time is calculated by examining the pipeline analysis.

  • When doing the pipeline analysis, there is a branch penalty which must be included in the actual pipeline analysis, not added on to the end. We figured this out when examining the pipeline for fun.c.

  • ARM assembly instructions can use up to 4 operands

  • With the ARM compiler we are using, every loop has more than one path: a continue path and an exit path, instead of one path that acts as both

  • Different ARM compilers produce different assembly code for the same program

  • It is possible to calculate the number of loop iterations of nested inner loops even if the limit of the loops is dependent upon the outer loop

  • The compiler sometimes provides a mnemonic that doesn’t match the opcode. This usually happens with branch instructions.

  • It is important that the INF file prints the loops in the order in which they are found
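As an example of the nested-loop finding, consider an inner loop whose starting value depends on the outer loop counter. The Python sketch below is only an illustration of the arithmetic, not how RALPHO computes it, and the loop shape is an assumed one.

```python
def inner_iterations(n):
    """Total inner-loop iterations for the assumed loop nest

        for (i = 0; i < n; i++)
            for (j = i; j < n; j++)

    The inner loop runs n - i times for each value of i.
    """
    return sum(n - i for i in range(n))


def inner_iterations_closed_form(n):
    """The same count as a formula: n + (n - 1) + ... + 1 = n(n + 1)/2."""
    return n * (n + 1) // 2


print(inner_iterations(10), inner_iterations_closed_form(10))
```

The count is still fixed at compile time even though the inner limit changes on every outer iteration, which is why it can be calculated.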

ACCOMPLISHMENTS:

  • Creating accurate assembly files and writing a program that converted them into a format that RALPHO can understand.

  • Making small changes to the ADF file to enable more accurate instruction look up so the INF file produced from RALPHO is complete.

  • Ensuring all blocks are in the INF file and all instructions are in the respective block

  • RALPHO now includes all necessary loop information in the INF file, things such as: loop limit, initial value, increment amount, and number of loop iterations.

  • RALPHO can now handle programs with multiple functions, even if one of the function names starts with an “r”.

  • Editing RALPHO to restart block numbers at 1 for each function in the program.

  • Having RALPHO correctly print information for function calls.

  • Running programs (with multiple functions) through the timing analyzer

  • Getting 100% accuracy for about 5 programs run through the timing analyzer, including one with a triply nested loop and one with more than one function

  • Figuring out how to display output when running a program on the ARM Architecture

  • Learning how to get more accurate information from uVision and use more of its features

  • Creating a dictionary of all instructions with opcodes in the timing analyzer so the timing analyzer can read in a mnemonic from the INF file and look up the respective opcode and end cycles from the file worst_stages_arm7tdmi.
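The dictionary idea from the last item can be sketched in Python as follows. This is not the timing analyzer's code, and the table contents are hypothetical placeholders. The real opcodes and end cycles come from worst_stages_arm7tdmi, whose format is not shown here.

```python
# Hypothetical table: mnemonic -> (opcode number, end cycles).
# These values are placeholders, NOT taken from worst_stages_arm7tdmi.
INSTRUCTION_TABLE = {
    "ADD": (1, 1),
    "LDR": (2, 3),
    "B": (3, 3),
}


def look_up(mnemonic):
    """Return (opcode, end_cycles) for a mnemonic read from the INF file."""
    key = mnemonic.strip().upper()
    try:
        return INSTRUCTION_TABLE[key]
    except KeyError:
        raise KeyError(f"mnemonic {mnemonic!r} is not in the instruction table") from None


print(look_up(" add "))
```

Failing loudly on an unknown mnemonic makes it obvious when an instruction is missing from the table, instead of silently timing it as zero cycles.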

Gem5 – As I Understand It
Overview:

To my understanding, Gem5 is a simulator similar to uVision. Gem5 is compatible with the ARM Processor and has many different options. There were two things we hoped to gain from using Gem5: the amount of time it takes for a program to run (to calculate the number of cycles) and a look at the pipeline.
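Converting a simulated run time into a cycle count is a multiplication by the clock frequency. Here is a minimal Python sketch. The function name is mine, and the 12 MHz clock is only an illustrative assumption, so substitute whatever clock rate the simulated processor actually uses.

```python
def cycles_from_time(seconds, clock_hz):
    """Convert a run time in seconds into a whole number of clock cycles."""
    return round(seconds * clock_hz)


# Assumed example: a run time of 1 microsecond at an assumed 12 MHz clock.
print(cycles_from_time(1e-6, 12_000_000))
```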


Installation:

I installed Gem5 on Linux. After getting Linux on my computer, this is an overview of what I did:

First I needed to update the system and install some packages.

sudo apt-get update; sudo apt-get upgrade

sudo apt-get install mercurial scons swig gcc m4 python python-dev libgoogle-perftools-dev g++

After this, use Mercurial to download the source from the web:

hg clone http://repo.gem5.org/gem5

Enter this new directory. Next you have to build gem5 using scons. This process takes about 20 minutes.

cd gem5/ ; scons build/ARM/gem5.opt (options*)

*There may be some options that need to be included here for better results, but I never included any.

To build the debug version, repeat the scons step above, but use the extension .debug instead of .opt:

scons build/ARM/gem5.debug


At this point, gem5 should be completely installed. I recommend running the test program first. This is how you simulate hello world:

build/ARM/gem5.opt configs/example/se.py -c tests/test-progs/hello/bin/arm/linux/hello

The syntax for running a program in gem5 is:

gem5-binary [gem5 options] simulation-script [script options]

The easiest way to figure out what the different options are is to use the help flag which is: -h

For example:

build/ARM/gem5(.debug or .opt) -h

OR

build/ARM/gem5.opt configs/example/(choose one) -h


The first command shows the options for gem5 itself, and the second shows the options for the chosen script. It is important to note that gem5 does not take an assembly file as input; it takes an executable or binary file.

In terms of running gem5 to get actual results, this is about as far as I was able to get. I did start to discover some information about the pipeline viewer, or PipeView as it is called in gem5.

There is information on the web (see the link at the end of this document) that discusses the difference between in-order and out-of-order pipeline analysis. PipeView is out-of-order. To my understanding, PipeView cannot be set to be in-order; however, there is a way to view the in-order pipeline analysis. I'm just not sure how.

Here is what I was able to figure out about PipeView:

[see www.m5sim.org/Visualization for a picture of PipeView]

To run pipeview, use the following command:

build/ARM/gem5.opt --debug-flags=O3PipeView --trace-start= --trace-file=trace.out configs/example/se.py --cpu-type=detailed --caches -c
-m

I had to figure out that you need to include the flag --cpu-type=detailed, because the website above is outdated and says -d. There are other possible options for the cpu-type (I believe there is one called arm_detailed).


After running this command, you have to produce the actual colored file with this command:

util/o3-pipeview.py -c 500 -o pipeview.out --color m5out/trace.out

This creates the file pipeview.out which can be read by:

less -r pipeview.out


I apologize that this is not a very complete guide. I could not figure out much more because we started to run out of time and wanted to focus our attention on something that would produce results.
For more information on gem5, you can visit the wiki page:
http://www.m5sim.org/Main_Page