To get a trial key
fill out the form below
Team License (a basic version)
Enterprise License (extended version)
* By clicking this button you agree to our Privacy Policy statement

Request our prices
New License
License Renewal
--Select currency--
USD
EUR
GBP
RUB
* By clicking this button you agree to our Privacy Policy statement

Free PVS-Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
C++ Tail Recursion Using 64-bit variabl…

C++ Tail Recursion Using 64-bit variables - Part 2

Jun 24 2015

In my previous post I talked about recursion problems in a Fibonacci function using 64-bit variables as function parameters, compiled using the Microsoft Visual C++ compiler. It turned out that while tail recursion was enabled by the compiler using 32-bit types it didn't really when switching to 64-bit ones. Just as a reminder, Tail Recursion is an optimization performed by the compiler. It is the process of transforming certain types of tail calls into jumps instead of function calls.

My conclusion was that tail recursion is not handled properly by the Visual C++ compiler and a possible explanation could be the presence of a bug.

The calculation of Fibonacci sequences of big integers is not an everyday task but it can still be a reliable example to show how tail calls are implemented.

Not happy with my conclusions and following several suggestions of users' comments (here on the blog, on Reddit and on StackOverflow) I wanted to understand more about this issue and to explore other solutions using different compilers.

Let's pick again the Visual Studio project I used in my previous post and hosted on GitHub. This was the Fibonacci function used:

typedef  unsigned long long ULONG64;
ULONG64 fib_tail_x64(ULONG64 n, ULONG64 res, ULONG64 next)
{
  if (n == 0) {
    return res;
  }
  return fib_tail_x64(n - 1, next, res + next);
}

As you might recall, compiling it in Release mode (which is generally needed to enable tail recursion due to the compiler optimization command /Ox), and targeting Win32 platforms resulted in having tail recursion disabled:

0333_TailRecursion_Part2/image1.png

Tail Recursion is not Enabled! It's an ugly assembly though I didn't omit frame stack pointers

One thing I didn't try to do (well to be honest I actually completely forgot) was to pay attention to the solution platform chosen to build the project. In Visual Studio Win32, which means to compile the solution using x86 processor instructions, is the default configuration set. Looking carefully at the generated assembly in the picture above it can be seen that registers used EAX, EBX, etc are, accordingly with the configuration chosen, 32-bit processor registers.

Switching configuration to x64, building and running the project resulted in this assembly:

0333_TailRecursion_Part2/image2.png

Tail Recursion is back!!!

Surprise!! Tail recursion is back, x64 registers are used instead (RAX, RDX, etc.) and the assembly is much shorter and cleaner!

Summarizing what said so far for tail calls we can take the following table as a reference:

0333_TailRecursion_Part2/image3.png

It seems then that the problem is only "restricted" when a x86 platform is targeted and x64 variables used.

At this stage I honestly couldn't really say "it's definitely a bug in the MS compiler" so still not satisfied I went trying the same experiment using another compiler.

This time I decided to change operating system altogether and use Clang, installed on my Macbook as part of XCode and based on LLVM.

0333_TailRecursion_Part2/image4.png

Terminal - clang version

To view the generated assembly I used a handy tool called Hopper Disassembler, available on both OS X and Linux (but for gdb debugger would have sufficed too I guess).

I repeated the exact same experiment. The following image shows the first generated assembly:

0333_TailRecursion_Part2/image5.png

Tail Recursion is clearly enabled: JNE (Jump Not Equal) is just a conditional jump when ZF (the "zero" flag) is equal to 0. The red arrow on the left points at the address where the program execution jumps if the condition is true. The careful observer might have already noticed that this code is not compiled for 32-bit processors. The assembly registers are actually 64-bit (RBD, RDI, etc.).

Our goal here is to verify instead that targeting a 32-bit platform tail recursion would still be enabled.

It is possible to force the compiler to generate 32-bit code, using the compiler flag -m32. Building again the solution in 32-bit mode I obtained:

0333_TailRecursion_Part2/image6.png

Tail Recursion enabled!

Bingo!!! By looking at the type of registers we can state that the executable is targeting the correct architecture we chose, and look at the last instruction the surprisingly result: Tail Recursion is enabled for 32-bit too!!!!!

EDIT

It seemed that a lot of users could not reproduce this problem (see comments below and on Reddit too). I went playing with the compiler optimization options, tweaking some parameters. By default using Release mode this is the default configuration for optimization:

0333_TailRecursion_Part2/image7.png

Default optimization options

After a few attempts I changed 'Whole Program Optimization' from /GL (yes) to No. I changed 'Omit Frame Pointers' too as I was pointed out that the generated x86 assembly, with frame pointers, is way longer and uglier (and we probably save a few clock cycles avoiding some cache misses too). There is a nice explanation on StackOverflow if you want to read more about omitting frame pointers.

However, with those changes in mind, I recompiled the solution using Win32 and I got this surprising result:

0333_TailRecursion_Part2/image8.png

Tail recursion for x86. Worked by disabling 'Whole Program Optimization'

EDIT 2

A few users pointed out that regardless of WPO (Whole Program Optimization) they did not experience any issue with tail recursion. This has puzzled me for a while until I realized I was not paying attention to one important detail: the way I was calling the fibonacci function from the main. Since so far I've used this function call to test my x64 fibonacci:

ULONG64 res = fib_tail_x64(40, 0, 1);

So far so good, nothing special there. I'm just passing literals as parameters.

What if we use a variable, or instead a function pointer? Let's try this:

ULONG64 a = 40;
ULONG64 res = fib_tail_x64(a, 0, 1);

Although it doesn't look like a big change, calling the fib_tail_x64 using variables, it enables tail recursion regardless of WPO (as long as an optimization flag >= /O2 is used). A Reddit user highlighted that calling the function indirectly through pointers (regardless of using literals) would produce the same result:

volatile bool a = true;
ULONG64 res = 0;
ULONG64(*ptr)(ULONG64 n, ULONG64 res, ULONG64 next) = NULL;
if (a)
{
  ptr = &fib_tail_x64;
}
res = ptr(40, 0, 1);

Final Thoughts

Although I was initially sure that a possible explanation for the absence of tail recursion was the presence of a bug in the Visual C++ compiler it turned out it is not. The assemblies produced by both VC++ and Clang are very similar for x86 platforms. My first conclusion was to keep in mind is to disable 'Whole Program Optimization', if and only if you are calling the x64 recursive function directly and passing literals as parameters.

After a second analysis it emerged that for a few users this was not happening as they were calling the function directly using variables or indirectly through function pointers.

I would be still interested to hear some thoughts from somebody working on the Visual C++ compiler team to know the reason behind this "issue".

See you soon!

This article was originally published at Coding Adventures Blog. Republished by the author permission.

Popular related articles
PVS-Studio ROI

Date: Jan 30 2019

Author: Andrey Karpov

Occasionally, we're asked a question, what monetary value the company will receive from using PVS-Studio. We decided to draw up a response in the form of an article and provide tables, which will sho…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: Jul 31 2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities

Date: Nov 21 2018

Author: Andrey Karpov

A brief description of technologies used in the PVS-Studio tool, which let us effectively detect a large number of error patterns and potential vulnerabilities. The article describes the implementati…
The way static analyzers fight against false positives, and why they do it

Date: Mar 20 2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
Free PVS-Studio for those who develops open source projects

Date: Dec 22 2018

Author: Andrey Karpov

On the New 2019 year's eve, a PVS-Studio team decided to make a nice gift for all contributors of open-source projects hosted on GitHub, GitLab or Bitbucket. They are given free usage of PVS-Studio s…
How PVS-Studio Proved to Be More Attentive Than Three and a Half Programmers

Date: Oct 22 2018

Author: Andrey Karpov

Just like other static analyzers, PVS-Studio often produces false positives. What you are about to read is a short story where I'll tell you how PVS-Studio proved, just one more time, to be more atte…
The Last Line Effect

Date: May 31 2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
Appreciate Static Code Analysis!

Date: Oct 16 2017

Author: Andrey Karpov

I am really astonished by the capabilities of static code analysis even though I am one of the developers of PVS-Studio analyzer myself. The tool surprised me the other day as it turned out to be sma…
PVS-Studio for Java

Date: Jan 17 2019

Author: Andrey Karpov

In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've co…
The Ultimate Question of Programming, Refactoring, and Everything

Date: Apr 14 2016

Author: Andrey Karpov

Yes, you've guessed correctly - the answer is "42". In this article you will find 42 recommendations about coding in C++ that can help a programmer avoid a lot of errors, save time and effort. The au…

Comments (0)

Next comments
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →
Accept