Alexey Govorov

Jul 27 2020

Tags:

#Cpp #DevOps

PVS-Studio: analyzing pull requests in Azure DevOps using self-hosted agents

Briefly about what we are dealing with
Preparing to use a self-hosted agent
Running analysis on a self-hosted agent
Some errors found in Minetest
Conclusion

Static code analysis is most effective when it is applied as a project changes, because errors are usually harder to fix later than at an early stage. We continue to expand the options for using PVS-Studio in continuous development systems. This time, we'll show you how to configure pull request analysis using self-hosted agents in Microsoft Azure DevOps, using the Minetest game as an example.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image1.png

Briefly about what we are dealing with

Minetest is an open-source cross-platform game engine containing about 200,000 lines of code in C, C++, and Lua. It allows you to create different game modes in voxel space. It supports multiplayer and has a lot of mods from the community. The project repository is located here: https://github.com/minetest/minetest.

The following tools are used to configure regular error detection:

PVS-Studio is a static analyzer that searches for errors and security defects in code written in C, C++, C#, and Java.

Azure DevOps is a cloud platform that allows you to develop, run applications, and store data on remote servers.

You can use Windows and Linux agent VMs to perform development tasks in Azure. However, running agents on local hardware has several important advantages:

The local host may have more resources than an Azure VM;
The agent doesn't "disappear" after completing its task;
You can configure the environment directly and manage build processes more flexibly;
Local storage of intermediate files has a positive effect on build speed;
You can complete more than 30 tasks per month for free.

Preparing to use a self-hosted agent

The process of getting started with Azure is described in detail in the article "PVS-Studio in the Clouds: Azure DevOps", so I will go straight to creating a self-hosted agent.

In order for agents to be able to connect to project pools, they need a special Access Token. You can get it on the "Personal Access Tokens" page, in the "User settings" menu.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image2.png

After clicking on "New token", you must specify a name and select Read & manage Agent Pools (you may need to expand the full list via "Show all scopes").

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image3.png

Copy the token right away: Azure will not show it again, and if you lose it, you will have to create a new one.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image4.png

A Docker container based on Windows Server Core will be used as the agent. The host is my desktop computer on Windows 10 x64 with Hyper-V.

First, you will need to expand the amount of disk space available to Docker containers.

To do this, in Windows, you need to modify the file 'C:\ProgramData\Docker\config\daemon.json' as follows:

{
  "registry-mirrors": [],
  "insecure-registries": [],
  "debug": true,
  "experimental": false,
  "data-root": "d:\\docker",
  "storage-opts": [ "size=40G" ]
}

To create a Docker image for agents with the build system and everything else they need, let's add a Dockerfile with the following content to the directory 'D:\docker-agent':

# escape=`

FROM mcr.microsoft.com/dotnet/framework/runtime

SHELL ["cmd", "/S", "/C"]

ADD https://aka.ms/vs/16/release/vs_buildtools.exe C:\vs_buildtools.exe
RUN C:\vs_buildtools.exe --quiet --wait --norestart --nocache `
  --installPath C:\BuildTools `
  --add Microsoft.VisualStudio.Workload.VCTools `
  --includeRecommended

RUN powershell.exe -Command `
  Set-ExecutionPolicy Bypass -Scope Process -Force; `
  [System.Net.ServicePointManager]::SecurityProtocol = `
    [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; `
  iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1')); `
  choco feature enable -n=useRememberedArgumentsForUpgrades;
  
RUN powershell.exe -Command `
  choco install -y cmake --installargs '"ADD_CMAKE_TO_PATH=System"'; `
  choco install -y git --params '"/GitOnlyOnPath /NoShellIntegration"'

RUN powershell.exe -Command `
  git clone https://github.com/microsoft/vcpkg.git; `
  .\vcpkg\bootstrap-vcpkg -disableMetrics; `
  $env:Path += '";C:\vcpkg"'; `
  [Environment]::SetEnvironmentVariable( `
    '"Path"', $env:Path, [System.EnvironmentVariableTarget]::Machine); `
  [Environment]::SetEnvironmentVariable( `
    '"VCPKG_DEFAULT_TRIPLET"', '"x64-windows"', `
    [System.EnvironmentVariableTarget]::Machine)

RUN powershell.exe -Command `
  choco install -y pvs-studio; `
  $env:Path += '";C:\Program Files (x86)\PVS-Studio"'; `
  [Environment]::SetEnvironmentVariable( `
    '"Path"', $env:Path, [System.EnvironmentVariableTarget]::Machine)

RUN powershell.exe -Command `
  $latest_agent = `
    Invoke-RestMethod -Uri "https://api.github.com/repos/Microsoft/azure-pipelines-agent/releases/latest"; `
  $latest_agent_version = `
    $latest_agent.tag_name.Substring(1, $latest_agent.tag_name.Length-1); `
  $latest_agent_url = `
    '"https://vstsagentpackage.azureedge.net/agent/"' + $latest_agent_version + `
    '"/vsts-agent-win-x64-"' + $latest_agent_version + '".zip"'; `
  Invoke-WebRequest -Uri $latest_agent_url -Method Get -OutFile ./agent.zip; `
  Expand-Archive -Path ./agent.zip -DestinationPath ./agent

USER ContainerAdministrator
RUN reg add hklm\system\currentcontrolset\services\cexecsvc `
        /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 60
RUN reg add hklm\system\currentcontrolset\control `
        /v WaitToKillServiceTimeout /t REG_SZ /d 60000 /f

ADD .\entrypoint.ps1 C:\entrypoint.ps1
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
ENTRYPOINT .\entrypoint.ps1

The result is a build system based on MSBuild for C++, with Chocolatey for installing PVS-Studio, CMake, and Git. Vcpkg is built for convenient management of the libraries that the project depends on. The image also downloads the latest version of the Azure Pipelines Agent.

The Dockerfile's ENTRYPOINT runs the PowerShell script 'entrypoint.ps1' to initialize the agent. In this script, you need to fill in the URL of the project's "organization", the token of the agent pool, and the PVS-Studio license parameters:

$organization_url = "https://dev.azure.com/<Microsoft Azure account>"
$agents_token = "<agent token>"

$pvs_studio_user = "<PVS-Studio user name>"
$pvs_studio_key = "<PVS-Studio key>"

try
{
  C:\BuildTools\VC\Auxiliary\Build\vcvars64.bat

  PVS-Studio_Cmd credentials -u $pvs_studio_user -n $pvs_studio_key
  
  .\agent\config.cmd --unattended `
    --url $organization_url `
    --auth PAT `
    --token $agents_token `
    --replace;
  .\agent\run.cmd
} 
finally
{
  # Agent graceful shutdown
  # https://github.com/moby/moby/issues/25982
  
  .\agent\config.cmd remove --unattended `
    --auth PAT `
    --token $agents_token
}

Commands for building an image and starting the agent:

docker build -t azure-agent -m 4GB .
docker run -id --name my-agent -m 4GB --cpu-count 4 azure-agent

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image5.png

The agent is running and ready to perform tasks.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image6.png

Running analysis on a self-hosted agent

For PR analysis, a new pipeline is created with the following script:

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image7.png

trigger: none

pr:
  branches:
    include:
    - '*'

pool: Default

steps:
- script: git diff --name-only
    origin/%SYSTEM_PULLREQUEST_TARGETBRANCH% >
    diff-files.txt
  displayName: 'Get committed files'

- script: |
    cd C:\vcpkg
    git pull --rebase origin
    CMD /C ".\bootstrap-vcpkg -disableMetrics"
    vcpkg install ^
    irrlicht zlib curl[winssl] openal-soft libvorbis ^
    libogg sqlite3 freetype luajit
    vcpkg upgrade --no-dry-run
  displayName: 'Manage dependencies (Vcpkg)'

- task: CMake@1
  inputs:
    cmakeArgs: -A x64
      -DCMAKE_TOOLCHAIN_FILE=C:/vcpkg/scripts/buildsystems/vcpkg.cmake
      -DCMAKE_BUILD_TYPE=Release -DENABLE_GETTEXT=0 -DENABLE_CURSES=0 ..
  displayName: 'Run CMake'

- task: MSBuild@1
  inputs:
    solution: '**/*.sln'
    msbuildArchitecture: 'x64'
    platform: 'x64'
    configuration: 'Release'
    maximumCpuCount: true
  displayName: 'Build'

- script: |
    IF EXIST .\PVSTestResults RMDIR /Q/S .\PVSTestResults
    md .\PVSTestResults
    PVS-Studio_Cmd ^
    -t .\build\minetest.sln ^
    -S minetest ^
    -o .\PVSTestResults\minetest.plog ^
    -c Release ^
    -p x64 ^
    -f diff-files.txt ^
    -D C:\caches
    PlogConverter ^
    -t FullHtml ^
    -o .\PVSTestResults\ ^
    -a GA:1,2,3;64:1,2,3;OP:1,2,3 ^
    .\PVSTestResults\minetest.plog
    IF NOT EXIST "$(Build.ArtifactStagingDirectory)" ^
    MKDIR "$(Build.ArtifactStagingDirectory)"
    powershell -Command "Compress-Archive -Force '.\PVSTestResults\fullhtml' '$(Build.ArtifactStagingDirectory)\fullhtml.zip'"
  displayName: 'PVS-Studio analyze'
  continueOnError: true

- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)'
    ArtifactName: 'pvs-studio-analysis'
    publishLocation: 'Container'
  displayName: 'Publish analysis report'

This pipeline is triggered when a pull request is received and runs on the agents assigned to the Default pool. You only need to grant it permission to use this pool.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image8.png

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image9.png

The script saves the list of modified files obtained with git diff. Then the dependencies are updated, the project solution is generated via CMake, and the project is built.

If the build succeeds, analysis of the changed files is started (the '-f diff-files.txt' flag). The auxiliary projects created by CMake are ignored, because only the necessary project is selected with the '-S minetest' flag. To speed up determining the relations between C++ header and source files, a special cache is created and stored in a separate directory (the '-D C:\caches' flag).

This way we can now get reports on analyzing changes in the project.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image10.png

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image11.png

As mentioned at the beginning of the article, a nice bonus of using self-hosted agents is a noticeable acceleration of task execution, due to local storage of intermediate files.

0751_Analyzing_Pull_Requests_In_Azure_DevOps/image12.png

Some errors found in Minetest

Overwriting the result

V519 The 'color_name' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 621, 627. string.cpp 627

static bool parseNamedColorString(const std::string &value,
                                  video::SColor &color)
{
  std::string color_name;
  std::string alpha_string;

  size_t alpha_pos = value.find('#');
  if (alpha_pos != std::string::npos) {
    color_name = value.substr(0, alpha_pos);
    alpha_string = value.substr(alpha_pos + 1);
  } else {
    color_name = value;
  }

  color_name = lowercase(value); // <=

  std::map<const std::string, unsigned>::const_iterator it;
  it = named_colors.colors.find(color_name);
  if (it == named_colors.colors.end())
    return false;
  ....
}

This function should parse a color name with a transparency parameter (for example, Green#77) and return its code. Depending on the result of the condition, the color_name variable is assigned either the part of the string before '#' or a copy of the function argument. However, the original argument, not the resulting string, is then converted to lowercase. As a result, the name can't be found in the color dictionary if the transparency parameter is present. We can fix this line like this:

color_name = lowercase(color_name);
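
To make the effect of the fix easier to see, here is a minimal, self-contained sketch of the splitting step. The helper name splitNamedColor and the use of std::tolower are assumptions made for illustration; Minetest's own lowercase function and color lookup are not reproduced here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of Minetest: splits "Name#AA" into a
// lowercased color name and the raw alpha suffix (empty if absent).
std::pair<std::string, std::string> splitNamedColor(const std::string &value)
{
  std::string color_name;
  std::string alpha_string;

  const std::size_t alpha_pos = value.find('#');
  if (alpha_pos != std::string::npos) {
    color_name = value.substr(0, alpha_pos);
    alpha_string = value.substr(alpha_pos + 1);
  } else {
    color_name = value;
  }

  // Lowercase the already-split name, not the original argument.
  std::transform(color_name.begin(), color_name.end(), color_name.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

  return {color_name, alpha_string};
}
```

With this version, "Green#77" yields the name "green" and the alpha string "77", so a dictionary lookup by name is no longer affected by the suffix.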

Redundant checks of conditions

V547 Expression 'nearest_emergefull_d == - 1' is always true. clientiface.cpp 363

void RemoteClient::GetNextBlocks (....)
{
  ....
  s32 nearest_emergefull_d = -1;
  ....
  s16 d;
  for (d = d_start; d <= d_max; d++) {
    ....
      if (block == NULL || surely_not_found_on_disk || block_is_invalid) {
        if (emerge->enqueueBlockEmerge(peer_id, p, generate)) {
          if (nearest_emerged_d == -1)
            nearest_emerged_d = d;
        } else {
          if (nearest_emergefull_d == -1) // <=
            nearest_emergefull_d = d;
          goto queue_full_break;
        }
  ....
  }
  ....
queue_full_break:
  if (nearest_emerged_d != -1) { // <=
    new_nearest_unsent_d = nearest_emerged_d;
  } else ....
}

The nearest_emergefull_d variable doesn't change while the loop is running: it is assigned only right before the jump out of the loop, so the check is always true and doesn't affect the algorithm. Either this is the result of a sloppy copy-paste, or the authors forgot to perform some calculations with it.

V560 A part of conditional expression is always false: y > max_spawn_y. mapgen_v7.cpp 262

int MapgenV7::getSpawnLevelAtPoint(v2s16 p)
{
  ....
  while (iters > 0 && y <= max_spawn_y) {               // <=
    if (!getMountainTerrainAtPoint(p.X, y + 1, p.Y)) {
      if (y <= water_level || y > max_spawn_y)          // <=
        return MAX_MAP_GENERATION_LIMIT; // Unsuitable spawn point

      // y + 1 due to biome 'dust'
      return y + 1;
    }
  ....
}

The loop condition already guarantees that 'y' does not exceed max_spawn_y at the start of each iteration. The subsequent opposite comparison is therefore always false and doesn't affect the result of the condition.

Missed pointer check

V595 The 'm_client' pointer was utilized before it was verified against nullptr. Check lines: 183, 187. game.cpp 183

void gotText(const StringMap &fields)
{
  ....
  if (m_formname == "MT_DEATH_SCREEN") {
    assert(m_client != 0);
    m_client->sendRespawn();
    return;
  }

  if (m_client && m_client->modsLoaded())
    m_client->getScript()->on_formspec_input(m_formname, fields);
}

Before the m_client pointer is accessed, it is checked for null using the assert macro. But this check exists only in the debug build. In a release build, assert is compiled out, and there is a risk of dereferencing a null pointer.
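
One possible way to keep the protection in release builds is an ordinary runtime check. The sketch below is only an illustration: FakeClient and requestRespawn are hypothetical names, and how Minetest should actually react to a missing client is a decision for its authors.

```cpp
// Hypothetical stand-in for the client class; not Minetest code.
struct FakeClient {
  int respawns = 0;
  void sendRespawn() { ++respawns; }
};

// Returns true if the respawn request was sent.
// Unlike assert, this check is not removed when NDEBUG is defined.
bool requestRespawn(FakeClient *client)
{
  if (client == nullptr)
    return false;
  client->sendRespawn();
  return true;
}
```

The caller can then decide what to do when the function returns false, instead of relying on a check that disappears from the release binary.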

Bit or not bit?

V616 The '(FT_RENDER_MODE_NORMAL)' named constant with the value of 0 is used in the bitwise operation. CGUITTFont.h 360

typedef enum  FT_Render_Mode_
{
  FT_RENDER_MODE_NORMAL = 0,
  FT_RENDER_MODE_LIGHT,
  FT_RENDER_MODE_MONO,
  FT_RENDER_MODE_LCD,
  FT_RENDER_MODE_LCD_V,

  FT_RENDER_MODE_MAX
} FT_Render_Mode;

#define FT_LOAD_TARGET_( x )   ( (FT_Int32)( (x) & 15 ) << 16 )
#define FT_LOAD_TARGET_NORMAL  FT_LOAD_TARGET_( FT_RENDER_MODE_NORMAL )

void update_load_flags()
{
  // Set up our loading flags.
  load_flags = FT_LOAD_DEFAULT | FT_LOAD_RENDER;
  if (!useHinting()) load_flags |= FT_LOAD_NO_HINTING;
  if (!useAutoHinting()) load_flags |= FT_LOAD_NO_AUTOHINT;
  if (useMonochrome()) load_flags |= 
    FT_LOAD_MONOCHROME | FT_LOAD_TARGET_MONO | FT_RENDER_MODE_MONO;
  else load_flags |= FT_LOAD_TARGET_NORMAL; // <=
}

The FT_LOAD_TARGET_NORMAL macro expands to zero, so the bitwise "OR" doesn't set any flags in load_flags, and the else branch can be removed.
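
The arithmetic can be checked at compile time. The sketch below mirrors the definitions quoted above with renamed local copies (RenderMode and loadTarget are assumptions chosen to avoid clashing with the real FreeType headers), so it is an illustration rather than FreeType code.

```cpp
#include <cstdint>

// Local copies of the quoted definitions, renamed for this sketch.
enum RenderMode { RENDER_MODE_NORMAL = 0, RENDER_MODE_LIGHT, RENDER_MODE_MONO };

constexpr std::int32_t loadTarget(RenderMode mode)
{
  return static_cast<std::int32_t>((mode & 15) << 16);
}

// OR-ing with the NORMAL target changes nothing, because it is zero.
static_assert(loadTarget(RENDER_MODE_NORMAL) == 0, "NORMAL target is zero");
static_assert((0x1234 | loadTarget(RENDER_MODE_NORMAL)) == 0x1234, "OR is a no-op");

// A non-zero mode does set bits in the target field.
static_assert(loadTarget(RENDER_MODE_MONO) == (2 << 16), "MONO target sets bits");
```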

Rounding integer division

V636 The 'rect.getHeight() / 16' expression was implicitly cast from 'int' type to 'float' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. hud.cpp 771

void drawItemStack(....)
{
  float barheight = rect.getHeight() / 16;
  float barpad_x = rect.getWidth() / 16;
  float barpad_y = rect.getHeight() / 16;

  core::rect<s32> progressrect(
    rect.UpperLeftCorner.X + barpad_x,
    rect.LowerRightCorner.Y - barpad_y - barheight,
    rect.LowerRightCorner.X - barpad_x,
    rect.LowerRightCorner.Y - barpad_y);
}

Rect getters return integer values. Integer division discards the fractional part before the result is stored in the floating-point variable. It looks like there are mismatched data types in these calculations.
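
The difference is easy to reproduce in isolation. The two helpers below are hypothetical and only contrast integer division with floating-point division; whether the fractional part is actually wanted in drawItemStack is for the project authors to decide.

```cpp
// Hypothetical helpers illustrating the difference; not Minetest code.

// Integer division happens first, then the truncated result is converted.
inline float sixteenthTruncated(int height)
{
  return static_cast<float>(height / 16);
}

// Making one operand floating-point keeps the fractional part.
inline float sixteenthExact(int height)
{
  return height / 16.0f;
}
```

For a height of 40, the first helper returns 2.0f and the second returns 2.5f.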

Suspicious sequence of branching operators

V646 Consider inspecting the application's logic. It's possible that 'else' keyword is missing. treegen.cpp 413

treegen::error make_ltree(...., TreeDef tree_definition)
{
  ....
  std::stack <core::matrix4> stack_orientation;
  ....
    if ((stack_orientation.empty() &&
      tree_definition.trunk_type == "double") ||
      (!stack_orientation.empty() &&
      tree_definition.trunk_type == "double" &&
      !tree_definition.thin_branches)) {
      ....
    } else if ((stack_orientation.empty() &&
      tree_definition.trunk_type == "crossed") ||
      (!stack_orientation.empty() &&
      tree_definition.trunk_type == "crossed" &&
      !tree_definition.thin_branches)) {
      ....
    } if (!stack_orientation.empty()) {                  // <=
  ....
  }
  ....
}

The tree generation algorithm contains else-if sequences. In the middle of one of them, the next if statement is on the same line as the closing brace of the previous else branch. Perhaps the code works correctly: before this if statement, blocks of the trunk are created, followed by leaves. On the other hand, it's possible that else is missing. Only the author can say for sure.

Incorrect memory allocation check

V668 There is no sense in testing the 'clouds' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. game.cpp 1367

bool Game::createClient(....)
{
  if (m_cache_enable_clouds) {
    clouds = new Clouds(smgr, -1, time(0));
    if (!clouds) {
      *error_message = "Memory allocation error (clouds)";
      errorstream << *error_message << std::endl;
      return false;
    }
  }
}

If new can't create an object, a std::bad_alloc exception is thrown, and it must be handled by a try-catch block. A null check like this is useless.
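
Two common ways to make the failure path meaningful are sketched below: catching std::bad_alloc, or allocating with new (std::nothrow), which does return a null pointer on failure. The Clouds struct and both function names are hypothetical stand-ins, and the use of std::unique_ptr is an assumption made to keep the sketch leak-free.

```cpp
#include <memory>
#include <new>
#include <string>

struct Clouds { int seed; };  // hypothetical stand-in for the real class

// Variant 1: plain new never returns null; failure surfaces as an exception.
bool createCloudsThrowing(std::unique_ptr<Clouds> &out, std::string &error_message)
{
  try {
    out.reset(new Clouds{42});
    return true;
  } catch (const std::bad_alloc &) {
    error_message = "Memory allocation error (clouds)";
    return false;
  }
}

// Variant 2: new (std::nothrow) returns nullptr on failure,
// so the null check is no longer dead code.
bool createCloudsNothrow(std::unique_ptr<Clouds> &out, std::string &error_message)
{
  out.reset(new (std::nothrow) Clouds{42});
  if (!out) {
    error_message = "Memory allocation error (clouds)";
    return false;
  }
  return true;
}
```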

Reading outside the array bound

V781 The value of the 'i' index is checked after it was used. Perhaps there is a mistake in program logic. irrString.h 572

bool equalsn(const string<T,TAlloc>& other, u32 n) const
{
  u32 i;
  for(i=0; array[i] && other[i] && i < n; ++i) // <=
    if (array[i] != other[i])
      return false;

  // if one (or both) of the strings was smaller then they
  // are only equal if they have the same length
  return (i == n) || (used == other.used);
}

Array elements are accessed before the index is checked, which may lead to an error. Perhaps the author should check the index first and leave the rest of the function unchanged:

for(i=0; i < n && array[i] && other[i]; ++i)
  if (array[i] != other[i])
    return false;
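
As a self-contained illustration of a bounds-first comparison, here is a minimal sketch on plain null-terminated strings. The free function equalsN is hypothetical: it is an analogue of equalsn written for this example, not the Irrlicht implementation, and it compares terminators instead of the stored lengths.

```cpp
#include <cstddef>

// Hypothetical analogue of equalsn for null-terminated strings.
// Compares at most n characters; the index bound is tested before any read.
bool equalsN(const char *a, const char *b, std::size_t n)
{
  std::size_t i;
  for (i = 0; i < n && a[i] && b[i]; ++i)
    if (a[i] != b[i])
      return false;

  // Either n characters matched, or a terminator was reached;
  // in the latter case the strings are equal only if both ended here.
  return i == n || a[i] == b[i];
}
```

Because the i < n test comes first, no element is read once the limit is reached, and strings shorter than n still compare equal when they have the same length and content.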

Other errors

This article covers the analysis of pull requests in Azure DevOps and doesn't aim to provide a detailed overview of errors found in the Minetest project. It shows only some code fragments that I found interesting. We suggest that the project authors don't rely on this article to correct errors, but perform a more thorough analysis of the warnings that PVS-Studio issues.

Conclusion

Thanks to its flexible command-line configuration, PVS-Studio analysis can be integrated into a wide variety of CI/CD scenarios. And the correct use of available resources pays off by increasing productivity.

Note that the pull request checking mode is only available in the Enterprise version of the analyzer. To get a demo Enterprise license, specify this in the comments when requesting a license on the download page. You can learn more about the difference between licenses on the Buy PVS-Studio page.
