V1076. Code contains invisible characters that may alter its logic. Consider enabling the display of invisible characters in the code editor.

Dec 07 2021

The analyzer has detected characters in the code that may mislead the developer. These characters can be invisible and can change how the code is rendered in an IDE. As a result, the developer and the compiler may interpret the same code differently.

Such characters can be inserted on purpose. This type of attack is called Trojan Source.

The analyzer issues a warning if it finds one of the following characters:

| Character | Code | Definition | Description |
|---|---|---|---|
| LRE | U+202A | LEFT-TO-RIGHT EMBEDDING | The text after the LRE character is interpreted as embedded and displayed left-to-right. The action of the LRE character is interrupted by the PDF character or a newline character. |
| RLE | U+202B | RIGHT-TO-LEFT EMBEDDING | The text after the RLE character is interpreted as embedded and displayed right-to-left. The action of the RLE character is interrupted by the PDF character or a newline character. |
| LRO | U+202D | LEFT-TO-RIGHT OVERRIDE | The text after the LRO character is forcibly displayed left-to-right. The action of the LRO character is interrupted by the PDF character or a newline character. |
| RLO | U+202E | RIGHT-TO-LEFT OVERRIDE | The text after the RLO character is forcibly displayed right-to-left. The action of the RLO character is interrupted by the PDF character or a newline character. |
| PDF | U+202C | POP DIRECTIONAL FORMATTING | The PDF character interrupts the action of an LRE, RLE, LRO or RLO character encountered earlier. It interrupts only the most recent of them. |
| LRI | U+2066 | LEFT-TO-RIGHT ISOLATE | The text after the LRI character is displayed left-to-right and interpreted as isolated. This means that other control characters do not affect the display of this text fragment. The action of the LRI character is interrupted by the PDI character or a newline character. |
| RLI | U+2067 | RIGHT-TO-LEFT ISOLATE | The text after the RLI character is displayed right-to-left and interpreted as isolated. This means that other control characters do not affect the display of this text fragment. The action of the RLI character is interrupted by the PDI character or a newline character. |
| FSI | U+2068 | FIRST STRONG ISOLATE | The direction of the text after the FSI character is set by the first strong directional character in this text fragment that is not inside a nested isolate. Other control characters do not affect the display of this text. The action of the FSI character is interrupted by the PDI character or a newline character. |
| PDI | U+2069 | POP DIRECTIONAL ISOLATE | The PDI character interrupts the action of an LRI, RLI or FSI character encountered earlier. It interrupts only the most recent of them. |
| LRM | U+200E | LEFT-TO-RIGHT MARK | The text after the LRM character is displayed left-to-right. The action of the LRM character is interrupted by a newline character. |
| RLM | U+200F | RIGHT-TO-LEFT MARK | The text after the RLM character is displayed right-to-left. The action of the RLM character is interrupted by a newline character. |
| ALM | U+061C | ARABIC LETTER MARK | The text after the ALM character is displayed right-to-left. The action of the ALM character is interrupted by a newline character. |
| ZWSP | U+200B | ZERO WIDTH SPACE | An invisible space character. The ZWSP character makes different strings look the same. For example, 'str[ZWSP]ing' is displayed as 'string'. |
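
A check for these characters can be sketched as a scan over the decoded code points of a source file. This is only an illustration, not the analyzer's implementation: the names 'Finding', 'IsSuspicious' and 'FindSuspicious' are hypothetical, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical result type: where a suspicious character was found.
struct Finding
{
  std::size_t offset;  // byte offset of the character in the input
  char32_t codePoint;  // decoded Unicode code point
};

// Returns true for the code points listed in the table above.
constexpr bool IsSuspicious(char32_t cp) noexcept
{
  return cp == 0x061C                      // ALM
      || cp == 0x200B                      // ZWSP
      || cp == 0x200E || cp == 0x200F      // LRM, RLM
      || (cp >= 0x202A && cp <= 0x202E)    // LRE, RLE, PDF, LRO, RLO
      || (cp >= 0x2066 && cp <= 0x2069);   // LRI, RLI, FSI, PDI
}

// Decodes the input as UTF-8 (assumed well-formed) and collects
// every character for which IsSuspicious returns true.
std::vector<Finding> FindSuspicious(std::string_view utf8)
{
  std::vector<Finding> result;
  std::size_t i = 0;
  while (i < utf8.size())
  {
    const auto b0 = static_cast<unsigned char>(utf8[i]);
    std::size_t len = 1;
    char32_t cp = b0;
    if (b0 >= 0xF0)      { len = 4; cp = b0 & 0x07; }
    else if (b0 >= 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if (b0 >= 0xC0) { len = 2; cp = b0 & 0x1F; }

    if (i + len > utf8.size())
      break;  // truncated sequence at the end of the input

    for (std::size_t k = 1; k < len; ++k)
      cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);

    if (IsSuspicious(cp))
      result.push_back({i, cp});
    i += len;
  }
  return result;
}
```

For the 'str[ZWSP]ing' example from the table, such a scan would report a single finding at byte offset 3.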

Look at the following code fragment:

#include <iostream>

int main()
{
  bool isAdmin = false;
  /*[RLO] } [LRI] if (isAdmin)[PDI] [LRI] begin admins only */ // (1)
      std::cout << "You are an admin.\n";
  /* end admins only [RLO]{ [LRI]*/                            // (2)
  return 0;
}

Let's look closer at line (1).

[LRI] if (isAdmin)[PDI]

Here the [LRI] character is in effect up to the [PDI] character. The 'if (isAdmin)' text is displayed left-to-right and is isolated. We get 'if (isAdmin)'.

[LRI] begin admins only */

Here the [LRI] character is in effect up to the end of the line. We get an isolated fragment: 'begin admins only */'

[RLO] {space1}, '}', {space2}, 'if (isAdmin)', 'begin admins only */'

Here the [RLO] character is in effect up to the end of the line and displays the text right-to-left. Each of the isolated fragments obtained in the previous paragraphs is treated as a separate indivisible character. We get the following sequence:

'begin admins only */', 'if (isAdmin)', {space2}, '{', {space1}

Note that the closing brace character is now displayed as '{' instead of '}', because paired characters such as braces are mirrored in right-to-left text.

The final view of line (1) that can be displayed in the editor:

/* begin admins only */ if (isAdmin) {

Similar transformations affect line (2), which is displayed like this:

/* end admins only */ }

The code fragment that can be displayed in the editor:

#include <iostream>

int main()
{
  bool isAdmin = false;
  /* begin admins only */ if (isAdmin) { 
      std::cout << "You are an admin.\n";
  /* end admins only */ }
  return 0;
}

The reviewer may think that the 'isAdmin' flag is checked before the message is displayed. They will ignore the comments and assume that the code is executed like this:

#include <iostream>

int main()
{
  bool isAdmin = false;
  if (isAdmin) { 
    std::cout << "You are an admin.\n";
  }
  return 0;
}

However, there is no check. For the compiler, the code above looks like this:

#include <iostream>

int main()
{
  bool isAdmin = false;
  std::cout << "You are an admin.\n";
  return 0;
}

Now let's look at a simple yet dangerous example that uses invisible characters:

#include <string>
#include <string_view>

enum class BlockCipherType { DES, TripleDES, AES, /*....*/ };

constexpr BlockCipherType
StringToBlockCipherType(std::string_view str) noexcept
{
  if (str == "AES[ZWSP]")
    return BlockCipherType::AES;
  else if (str == "TripleDES[ZWSP]")
    return BlockCipherType::TripleDES;
  else
    return BlockCipherType::DES;
}

The 'StringToBlockCipherType' function converts a string to one of the values of the 'BlockCipherType' enumeration. You may think that the function returns three different values, but it doesn't. Since an invisible space character [ZWSP] is added at the end of each string literal, the equality checks fail for the strings 'AES' and 'TripleDES'. As a result, instead of three expected return values, the function returns only 'BlockCipherType::DES'. At the same time, the code editor may display the code like this:

#include <string>
#include <string_view>

enum class BlockCipherType { DES, TripleDES, AES, /*....*/ };

constexpr BlockCipherType
StringToBlockCipherType(std::string_view str) noexcept
{
  if (str == "AES")
    return BlockCipherType::AES;
  else if (str == "TripleDES")
    return BlockCipherType::TripleDES;
  else
    return BlockCipherType::DES;
}
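
The difference between the two listings shows up only at the byte level. Here is a minimal sketch, assuming the source file is UTF-8 encoded, so that ZWSP occupies three bytes. The bytes are written as explicit escape sequences to keep them visible, and the names 'visible' and 'withZwsp' are hypothetical.

```cpp
#include <string_view>

// "\xE2\x80\x8B" is the UTF-8 encoding of ZERO WIDTH SPACE (U+200B).
constexpr std::string_view visible  = "AES";
constexpr std::string_view withZwsp = "AES\xE2\x80\x8B";

static_assert(visible.size() == 3);
static_assert(withZwsp.size() == 6);  // three extra bytes the editor may not show
static_assert(visible != withZwsp);   // so a comparison with "AES" is false
```

Both literals may look identical in an editor, but the compiler compares all six bytes of the second one.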

If the analyzer issues a warning about invisible characters in the code, turn on the display of invisible characters in your code editor and make sure they don't change the program's logic.
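
If the editor cannot display such characters, a small filter can make them visible instead. The sketch below is an assumption-laden illustration rather than a PVS-Studio feature: 'Mark' and 'RevealInvisible' are hypothetical names, the input is assumed to be UTF-8, and each character from the table is replaced with a tag in the notation used in this article.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical lookup entry: UTF-8 bytes of a character and its visible tag.
struct Mark
{
  std::string_view bytes;
  std::string_view tag;
};

// Replaces every character from the table above with a readable tag.
std::string RevealInvisible(std::string_view text)
{
  static constexpr Mark marks[] = {
    {"\xD8\x9C",     "[ALM]"},  {"\xE2\x80\x8B", "[ZWSP]"},
    {"\xE2\x80\x8E", "[LRM]"},  {"\xE2\x80\x8F", "[RLM]"},
    {"\xE2\x80\xAA", "[LRE]"},  {"\xE2\x80\xAB", "[RLE]"},
    {"\xE2\x80\xAC", "[PDF]"},  {"\xE2\x80\xAD", "[LRO]"},
    {"\xE2\x80\xAE", "[RLO]"},  {"\xE2\x81\xA6", "[LRI]"},
    {"\xE2\x81\xA7", "[RLI]"},  {"\xE2\x81\xA8", "[FSI]"},
    {"\xE2\x81\xA9", "[PDI]"},
  };

  std::string out;
  std::size_t i = 0;
  while (i < text.size())
  {
    bool replaced = false;
    for (const Mark& m : marks)
    {
      // substr clamps at the end of the input, so a partial match is never equal.
      if (text.substr(i, m.bytes.size()) == m.bytes)
      {
        out += m.tag;
        i += m.bytes.size();
        replaced = true;
        break;
      }
    }
    if (!replaced)
      out += text[i++];  // copy any other byte unchanged
  }
  return out;
}
```

Running such a filter over the string literal from the last example would turn the seemingly plain "AES" back into "AES[ZWSP]".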

