[In this reprinted #altdevblogaday opinion piece, Gamer Camp’s Alex Darby explains how to set up Visual Studio to look at optimized assembly code generated for simple code snippets.]

It’s that time again where I have managed to find a few spare hours to squoze out an article for the Low Level Curriculum. This is the eighth post in this series, which is not in any way significant except that I like the number 8. As well as being a power of two, it is also the maximum number of unarmed people who can simultaneously get close enough to attack you (according to a martial arts book I once read).

This post covers how to set up Visual Studio to allow you to easily look at the optimized assembly code generated for simple code snippets like the ones we deal with in this series.
If you wonder why I feel this is worth a post of its own, here’s the reason – optimizing compilers are good, and given code with constants as input and no external output (like the snippets I give as examples in this series) the compiler will generally optimize the code away to nothing – which I find makes it pretty hard to look at. This should prove immensely useful, both to refer back to, and for your own experimentation.

Here are the backlinks for preceding articles in the series in case you want to refer back to any of them (warning: the first few are quite long):
- A Low Level Curriculum for C and C++
- C / C++ Low Level Curriculum part 2: Data Types
- C / C++ Low Level Curriculum Part 3: The Stack
- C / C++ Low Level Curriculum: More Stack
- C / C++ Low Level Curriculum Part 5: Even More Stack
- C / C++ Low Level Curriculum Part 6: Conditionals
- C / C++ Low Level Curriculum Part 7: More Conditionals

Assumptions

Strictly speaking, dear reader, I am making tons of assumptions about you as I write this – that you read English, that you like to program, etc. – but we’ll be here all day if I try to list those, so let’s stick to the ones that might be immediately inconvenient if they were incorrect.
I will be assuming that you have access to some sub-species of Visual Studio 2010 on a Windows PC, and that you are familiar with using it to do all the everyday basics like change build configurations, open files, edit, compile, run, and debug C/C++.

Creating a project

Open Visual Studio and from the menu choose “File – New – Project…”. Once the new project wizard window opens (see below):
- go to the tree view on the left of the window and select “Other Languages – Visual C++”
- in the main pane select “Win32 Console Application”
- give it a name in the Name edit box
- browse for a location of your choosing on your PC
- click OK to create the project

Once you have clicked OK just click “Finish” on the next stage of the wizard – in case you’re wondering, the options available when you click next don’t matter for our purposes (and un-checking the “Precompiled header” check box makes no difference, it still generates a console app that uses a precompiled header…).

Changing the Project Properties

The next step is to use the menu to select “Project – YourProjectName Properties”, which will bring up the properties dialog for the project. When the properties dialog appears (see image below):
- select “All Configurations” from the Configuration drop list
- select “Configuration Properties – General” in the tree view at the left of the window
- in the main pane change “Whole Program Optimization” to “No Whole Program Optimization”
Next (see image below):
- in the tree view, navigate to “C/C++ – Code Generation”
- in the main pane, change “Basic Runtime Checks” to “Default” (i.e. off)

Finally (see image below):
- in the tree view, go to “C/C++ – Output Files”
- in the main pane change “Assembler Output” to “Assembly With Source Code (/FAs)”
- once you’ve done that click “OK”

Now, when you compile, the Visual Studio compiler will generate an .asm file as well as an .exe file. This file will contain the intermediate assembly code generated by the compiler, with the source code inserted into it inline as comments.

You could alternatively choose the “Assembly, Machine Code and Source (/FAcs)” option if you like – this will generate a .cod file that contains the machine code as well as the asm and source. I prefer the regular .asm because it’s less visually noisy and the assembler mnemonics are all aligned on the same column, so that’s what I’ll assume you’re using if you’re following the article, but the .cod file is fine.
So, what did we do there?

Well, first we turned off link time code generation. Amongst other things, this will prevent the linker stripping the .asm generated for functions that are compiled but not called anywhere.

Secondly, we turned off the basic runtime checks (which are already off in Release). These checks make the function prologues and epilogues generated do significant amounts of (basically unnecessary) extra work, causing a worst case 5x slowdown (see this post by Bruce Dawson on his personal blog for an in depth explanation).
Finally, we asked the compiler not to throw away the assembly code it generates for our program; this data is produced by the compilation process whenever you compile but is usually thrown away, we’re just asking Visual Studio to write it into an .asm file so we can take a look at it.

Since we made these changes for “All Configurations” this means we will have access to .asm files containing the assembly code generated by both the Debug and Release build configurations.

Let’s try it out

So in the spirit of discovery, let’s try it out (for the sake of familiarity) with a language feature we looked at last time – the conditional operator:

The question you have in your head at this moment should be “why have we put the code into a function?”. Rest assured that this will become apparent soon enough.

Now we have to build the code and look in the .asm files generated to see what the compiler has been up to…

First build the Debug build configuration – this should already be selected in the solution configuration drop-down (at the top of your Visual Studio window unless you’ve moved it).
Next build the Release configuration.

Now we need to open the .asm files. Unless you have messed with project settings that I didn’t tell you to, these will be in the following paths:
- <path where you put the project>/Debug/<projectName>.asm
- <path where you put the project>/Release/<projectName>.asm

.asm files

I’m not going to go into any significant detail about how .asm files are laid out here; if you want to find out more here’s a link to the Microsoft documentation for their assembler.

The main thing you should note is that we can find the C/C++ functions in the .asm file by looking for their names; and that – once we find them – the mixture of source code and assembly code looks basically the same as it does in the disassembly view of Visual Studio in the debugger.
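For anyone following along without the original listing to hand, here is a minimal sketch of the kind of test program being built in this section. The names ConditionalTest, bFlag, iOnTrue, and iOnFalse are the ones used in this article; the exact signature, the caller's name, and the values passed are assumptions made purely for illustration (in the article the call is made from main(), which then returns 0).

```cpp
// Sketch only: function and parameter names follow the article, but the
// exact signature is an assumption.
int ConditionalTest(bool bFlag, int iOnTrue, int iOnFalse)
{
    // the conditional operator under examination
    return bFlag ? iOnTrue : iOnFalse;
}

// Hypothetical stand-in for the article's main(): every input is a
// compile-time constant and the result is not used for anything.
int CallWithConstants()
{
    bool bFlag    = true;
    int  iOnTrue  = 1;
    int  iOnFalse = 0;
    int  iResult  = ConditionalTest(bFlag, iOnTrue, iOnFalse);
    (void)iResult; // silence the unused-variable warning
    return 0;
}
```

To reproduce the article's setup, the body of CallWithConstants() would simply live in main() instead.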
main()

Let’s look at main() first. This is where I explain why the code snippet we wanted to look at was put in a function. I can tell you’re excited.

Here’s main() from the Debug .asm (I’ve reformatted it slightly to make it take up less vertical space):

As long as you’ve read the previous posts, this should mostly look pretty familiar.
It breaks down as follows:
- lines 1-8: these lines define the offsets of the various Stack variables from [ebp] within main()’s Stack Frame
- lines 10-15: function prologue of main()
- lines 17-20: initialize the Stack variables
- lines 22-30: push the parameters to ConditionalTest() into the Stack, call it, and assign its return value
- line 32: sets up main()’s return value
- lines 34-38: function epilogue of main()
- line 39: return from main()

Nothing unexpected there really, the only new thing to take in is the declarations of the Stack variable offsets from [ebp]. I feel these tend to make the assembly code easier to follow than the code in the disassembly window in the Visual Studio debugger.

And, for comparison, here’s main() for the Release .asm:

The astute amongst you will have noticed that the Release assembly code is significantly smaller than the Debug. In fact, it’s clearly doing nothing at all other than returning 0. Good optimizing! High five!

As I alluded to earlier, the optimizing compiler is great at spotting code that evaluates to a compile time constant and will happily replace any code it can with the equivalent constant.

So that’s why we put the code snippet in a function

It should hopefully be relatively clear by this point why we might have put the code snippet into a function, and then asked the linker not to remove code for functions that aren’t called.
Even if it can optimize away calls to a function, the compiler can’t optimize away the function before link time because some code outside of the object file it exists in might call it. Incidentally, the same effect usually keeps variables defined at global scope from being optimized away before linkage.

I’m going to call this Schrödinger linkage (catchy, right?). If we want our simple code snippet to stay around after optimizing we only need to make sure that it takes advantage of Schrödinger linkage to cheat the optimizer.
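To make this a little more concrete, here is a small sketch (the function names are hypothetical and not from the original listings). Both functions have external linkage, so with whole program optimization turned off the compiler should keep a body for each of them in the object file; the difference lies in what that body is allowed to contain.

```cpp
// Inputs are all compile-time constants, so an optimizing compiler is free
// to fold the whole body down to "return 6", even though the function
// itself survives.
int SumOfConstants()
{
    int a = 1;
    int b = 2;
    int c = 3;
    return a + b + c;
}

// Inputs arrive as parameters from callers the compiler may not be able to
// see, so the additions cannot be replaced with a constant.
int SumOfParameters(int a, int b, int c)
{
    return a + b + c;
}
```

Comparing the two functions in the Release .asm is a quick way to see the effect for yourself; the exact instructions emitted will depend on your compiler version and settings.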
If the compiler can’t tell whether the function will be called, then it certainly can’t tell what the values of its parameters will be during one of these potential calls, or what its return value might be used for, and so it can’t optimize away any code that relies on those inputs or contributes to the output either.

The upshot of this is that if we put our code snippet in a function, make sure that it uses the function parameters as inputs, and that its output is returned from the function, then it should survive optimization.

It’s really a testament to all the compiler programmers over the years that it takes so much effort to get at the optimized assembly code generated by a simple code snippet – compiler programmers we salute you!

ConditionalTest()

So, here’s the Debug .asm for ConditionalTest() (ignoring the prologue / epilogue):

As you should be able to see, this is doing basically the same thing as the code we looked at in the Debug disassembly in the previous article:
- branching based on the result of testing the value of bFlag (the mnemonic test does a bitwise logical AND)
- both branches set a Stack variable at an offset of tv66 from [ebp]
- and both branches then execute the last line, which copies the content of that address into eax

Again, the assembly code is arguably easier to follow than the corresponding disassembly because the jmp mnemonic jumps to labels visibly defined in the code, whereas in the disassembly view in Visual Studio you generally have to cross reference the operand to jmp with the memory addresses in the disassembly view to see where it’s jumping to…

Let’s compare this with the Release assembler (again not showing the function prologue or epilogue):

You will note that the work of this function is now done in 4 instructions as opposed to 9 in the Debug:
- it compares the value of bFlag against 0
- unconditionally moves the value of iOnTrue into eax
- if the value of bFlag was not equal to 0 (i.e. it was true) it jumps past the next instruction…
- …otherwise this moves the value of iOnFalse into eax

As I’ve stated before I’m not an assembly code programmer and I’m not an optimization expert.
Consequently, I’m not going to offer my opinion on the significance of the ordering of the instructions in this Release assembly code.

I am, however, prepared to go out on a limb and say it’s a pretty safe bet that the Release version with 4 instructions is going to execute significantly faster than the Debug version with 9.

So, why such a big difference between Debug and Release for something that when debugging at source level is a single-step?

Essentially this is because the unoptimized assembly code generated by the compiler must be amenable to single-step debugging at the source level:
- it almost always does the exact logical equivalent of what the high level code asked it to do and, specifically, in the same order
- it also has to frequently write values from CPU registers back into memory so that the debugger can show them updating

Summary

What’s the main point I’d like you to take away from this article? Optimizing compilers are feisty!

You have to know how to stop them optimizing away your isolated C/C++ code snippets if you want to easily be able to see the optimized assembly code they generate.

This article shows a simple boilerplate way to short-circuit the Visual Studio optimizing compiler – mileage will vary on other platforms.
There are other strategies to stop the optimizer optimizing away your code, but they basically all come down to utilizing the Schrödinger linkage effect; in general:
- use global variables, function parameters, or function call results as inputs to the code
- use global variables, function return values, or function call parameters as outputs from the code
- if you’re not using Visual Studio’s compiler you may also need to turn off inlining

A final extreme method I have been told about is to insert nop instructions via inline assembly around / within the code you want to isolate. Note that you should use this approach with caution, as it interferes directly with the optimizer and can easily affect the output to the point where it is no longer representative.

Epilogue

So, I hope you found this interesting – I certainly expect you will find it useful 🙂

The next article (as promised last time!) is about looping, which is another reason why it seemed like a good time to cover getting at optimized assembly code for simple C/C++ snippets.

I will be referring back to this in future articles in situations where looking at the optimized assembly code is particularly relevant.
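Returning briefly to the strategies listed in the summary, here is a sketch of the global-variable variant, where globals act as both the inputs and the output of the snippet. All of the names here are hypothetical; the point is only that variables with external linkage could be read or written by code in another object file, so the compiler cannot assume it knows their values inside the function.

```cpp
// Hypothetical globals used as inputs and output. Because they have
// external linkage, code the compiler cannot see might change or read
// them, so the loads, the add, and the store all have to stay.
int g_iInputA = 3;
int g_iInputB = 4;
int g_iOutput = 0;

void AddGlobals()
{
    g_iOutput = g_iInputA + g_iInputB;
}
```

As with the function-parameter approach, this assumes whole program optimization is turned off; with it on, the linker gets to see everything and may well work out the answer for itself.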
If you’re wondering what you should look at first to see how Debug and Release code differ, and want to get practise at beating the optimizer, I’d suggest starting with something straightforward like adding a few numbers together.

Lastly, but by no means leastly, thanks to Rich, Ted, and Bruce for their input and proof reading.

[This piece was reprinted from #AltDevBlogADay, a shared blog initiative started by @mike_acton devoted to giving game developers of all disciplines a place to motivate each other to write regularly about their personal game development passions.]