Our website uses cookies to enhance your browsing experience.
Accept
to the top
>
>
>
Exploring Nau Engine codebase: pt.2

Exploring Nau Engine codebase: pt.2

Feb 21 2025

Welcome to the second part of our Nau Engine trilogy, where we delve into key aspects of optimization and performance enhancement. Our goal is to discover issues that could impact the performance and stability of games powered by Nau Engine.

In the first part, we focused on the engine's functionality, highlighting three key error blocks: memory issues, copy-paste, and logic errors. Today we talk about performance: it also plays an important role in development. Now, let's shift our focus to the analysis results with PVS-Studio.

Fragment N1

std::vector<AnimationTargetData> m_targets;

void AnimationComponent::addAnimationTarget(IAnimationTarget::Ptr target)
{
  if (target)
  {
    if (auto* nauObject = target->as<scene::NauObject*>())
    {
      ....
      m_targets.push_back(AnimationTargetData(std::move(wrapper), nullptr));
    }
    else
    {
      m_targets.push_back(AnimationTargetData(std::move(target), nullptr));
    }
  }
}

The PVS-Studio warning: V823 Decreased performance. Object may be created in-place in the 'm_targets' container. Consider replacing methods: 'push_back' -> 'emplace_back'. animation_component.cpp 180

In the code, push_back is used to add objects to the m_targets vector, which leads to the creation of a temporary object that is then copied or moved into a container. This causes a superfluous call to the move/copy constructor. To eliminate intermediate steps and create the object directly within the container, it'd be better to replace push_back with emplace_back. The last one passes constructor arguments directly to the vector, creating the object "in-place" without extra overhead.

Fragment N2

void writeContainerHeader(....)
{
    ....
    const eastl::string contentLength = eastl::to_string(serializedData.size());

    Vector<eastl::tuple<eastl::string, eastl::string>> httpHeader = {
        {"NauContent-Kind", eastl::string(kind)},
        {"Content-Type", "application/json"},
        {"Content-Length", std::move(contentLength)}
    };
    ....
}

The PVS-Studio warning: V833 Passing the const-qualified object 'contentLength' to the 'std::move' function disables move semantics. nau_container.cpp 68

The contentLength object of the eastl::string type is declared with the const qualifier. Developers might have intended to move it into the httpHeader vector via std::move. Unfortunately, the move operation doesn't occur because the eastl::string constructor overload with the rvalue reference couldn't be selected. Instead, the overload resolution favors the copy constructor.

The fix is quite straightforward: remove the const qualifier from contentLength, allowing the object to be moved.

But that's not all—there's another implicit string copy in this code. When httpHeader is declared, the std::initializer_list constructor is called. Since std::initializer_list is a lightweight proxy object over the const T array, the compiler effectively rewrites the code as follows:

void writeContainerHeader(....)
{
    ....
    eastl::string contentLength = eastl::to_string(serializedData.size());

    const eastl::tuple<eastl::string, eastl::string> backing_array[] {
        {"NauContent-Kind", eastl::string(kind)},
        {"Content-Type", "application/json"},
        {"Content-Length", std::move(contentLength)}      
    };

    Vector<eastl::tuple<eastl::string, eastl::string>> httpHeader {
      std::initializer_list { backing_array }
    };
    ....
}

Based on this, constant tuples will be copied into the vector, leading to superfluous string copies within the constant array. It'd be better to use emplace_back calls instead of std::initializer_list:

void writeContainerHeader(....)
{
    ....
    eastl::string contentLength = eastl::to_string(serializedData.size());

    Vector<eastl::tuple<eastl::string, eastl::string>> httpHeader;
    httpHeader.reserve(3);

    httpHeader.emplace_back("NauContent-Kind", eastl::string(kind));
    httpHeader.emplace_back("Content-Type", "application/json");
    httpHeader.emplace_back("Content-Length", std::move(contentLength));
}

Such code will indeed be more high-performance, though at the expense of brevity.

Fragment N3

inline const ComponentInfo getEntityComponentInfo(....) const
{
  EntityComponentRef ref = getEntityComponentRef(eid, cid);
  if (ref.isNull())
    return ComponentInfo("<invalid>", eastl::move(ref));
  return ComponentInfo(dataComponents
                       .getComponentNameById(ref.getComponentId()),
                       eastl::move(ref));
}

The PVS-Studio warning: V839 The 'EntityManager::getEntityComponentInfo' function returns a constant value. This may interfere with move semantics. entityManager.h 1855

The getEntityComponentInfo function returns objects of the const ComponentInfo type. However, it's not recommended to return const-qualified objects from functions (see CppCoreGuidelines F.49). Before C++17, it could lead to redundant copies when the object is initialized with the function result.

The fix this, we can just remove const from the return type.

inline ComponentInfo getEntityComponentInfo(....) const
{
  ....
}

Here are other cases:

  • V839 The 'getCommit' function returns a constant value. This may interfere with move semantics. engine_version.h 54
  • V839 The 'getBranch' function returns a constant value. This may interfere with move semantics. engine_version.h 59
  • V839 The 'getFullPathCache' function returns a constant value. This may interfere with move semantics. CCFileUtils.h 831
  • V839 The 'FileUtils::getSearchResolutionsOrder' function returns a constant value. This may interfere with move semantics. CCFileUtils.cpp 971
  • V839 The 'FileUtils::getSearchPaths' function returns a constant value. This may interfere with move semantics. CCFileUtils.cpp 977
  • V839 The 'FileUtils::getOriginalSearchPaths' function returns a constant value. This may interfere with move semantics. CCFileUtils.cpp 983
  • V839 The 'FileUtils::getDefaultResourceRootPath' function returns a constant value. This may interfere with move semantics. CCFileUtils.cpp 995
  • V839 The 'ProgramNau::getActiveAttributes' function returns a constant value. This may interfere with move semantics. program_nau.cpp 99

Fragment N4

eastl::unique_ptr<uint8_t[]> convertData(....)
{
  switch (format)
  {
  case cocos2d::backend::PixelFormat::RGBA4444:
  {
    ....
    eastl::unique_ptr<uint8_t[]> newData{ new uint8_t[....] };

    ....
    return std::move(newData);
  }
  ....
}

The PVS-Studio warning: V828 Decreased performance. Moving a local object in a return statement prevents copy elision. texture_nau.cpp 78

In this code snippet, the function returns the object of the eastl::unique_ptr<uint8_t[]> type. When the return type matches the type of the return value—and that value is a local variable—the compiler can optimize the process in one of the following ways:

  • Named Return Value Optimization (NRVO): The object can be created directly when the function is called, eliminating any unnecessary move/copy operations.
  • Move constructor: If NRVO can't be applied, the compiler can transfer object ownership via the move constructor.
  • Copy constructor: If neither NRVO nor move is possible, the copy constructor is used.

When std::move is used, the expression in the return statement no longer matches the return type, which prevents the compiler from applying NRVO. This leads to superfluous overhead, as the compiler is forced to use move/copy operations instead of the more efficient NRVO.

To avoid this pessimization caused by the std::move(newData) expression, the std::move call should be removed.

Fragment N5

void configureVirtualFileSystem()
{
  ....
  fs::path currentPath = fs::current_path();
  auto& props = getServiceProvider().get<GlobalProperties>();

  if (auto contentPath = props.getValue<eastl::string>("contentPath");
      contentPath)
  {
    const std::filesystem::path path = contentPath->c_str();
#ifdef NAU_PACKAGE_BUILD
    for (const auto& entry : fs::directory_iterator(path))
    {
      if (   entry.is_regular_file()
          && entry.path().extension() == ".assets")
      {
        const auto& filePath = entry.path();
        auto assetPackFS = io::createAssetPackFileSystem(
          strings::toU8StringView(filePath.string())
        );
        vfs.mount("/packs", assetPackFS).ignore();
        assetDb.addAssetDB("packs/assets_database/database.db");
      }
    }
#else
    auto contentFs = io::createNativeFileSystem(path.string());
    vfs.mount("/content", std::move(contentFs)).ignore();

    auto assetDbPath = path.parent_path() / "assets_database";
    vfs.mount("/assets_db", std::move(contentFs)).ignore();

    assetDb.addAssetDB("assets_db/database.db");
#endif
  }
}

The PVS-Studio warning: V808 'currentPath' object of 'path' type was created but was not utilized. default_application_delegate.cpp 101

The analyzer has detected that the currentPath variable created to store the current path via fs::current_path() is never used within the configureVirtualFileSystem() function. This suggests that after refactoring, the variable will be redundant and can be safely removed without impacting the function's logic.

Fragment N6

class ThreadPoolExecutor final : public Executor,
                                 public IRuntimeComponent
{
public:
  ThreadPoolExecutor(std::optional<size_t> threadsCount)
  {
    const size_t maxThreads = threadsCount ? *threadsCount
                                           : getDefaultThreadsCount();
    m_threads.reserve(maxThreads);

    for (size_t i = 0; i < maxThreads; ++i)
    {
      m_threads.emplace_back([](ThreadPoolExecutor& executor,
                                size_t threadIndex)
      {
        threading::setThisThreadName(
          std::format("Nau Pool-{}", threadIndex + 1)
        );
        executor.threadWork();
      }, std::ref(*this), i);
    }

    RuntimeObjectRegistration{nau::Ptr<>{this}}.setAutoRemove();
    void();
  }
  ....
};

The PVS-Studio warning: V607 Ownerless expression 'void ()'. thread_pool_executor.cpp 60

The constructor of the ThreadPoolExecutor class creates worker threads. At the end, as a cherry on the cake, the constructor body is decorated with the void(); construct that serves no practical purpose. How it got there is unclear; it might be a leftover from incomplete refactoring or an abandoned design choice.

Fragment N7

struct ScheduledArchetypeComponentTrack
{
  ....
  ScheduledArchetypeComponentTrack() {}
  ....
};

The PVS-Studio warning: V832 It's better to use '= default;' syntax instead of empty constructor body. ecsQueryInternal.h 35

The ScheduledArchetypeComponentTrack structure has a user-defined constructor. However, it'd be better to declare it as default, making the class trivially constructible. This allows the compiler to generate more optimized code. Moreover, some algorithms may select a more efficient strategy (benchmark) when working with trivially default constructible objects.

Here is the enhanced code:

struct ScheduledArchetypeComponentTrack
{
  ....
  ScheduledArchetypeComponentTrack() = default;
  ....
};

Here are other cases:

  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_drvDecl.h 149
  • V832 It's better to use '= default;' syntax instead of empty constructor body. frustumClusters.h 79
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 17
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 38
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 60
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 87
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 104
  • V832 It's better to use '= default;' syntax instead of empty constructor body. dag_hlsl_floatx.h 122
  • V832 It's better to use '= default;' syntax instead of empty destructor body. lowLatencyStub.cpp 47

Fragment N8

if (!this->hasComponent(depConstString.hash) 
&& (!optional || (optional && !can_skip_optional)))
{
  ....
}

The PVS-Studio warning: V728 An excessive check can be simplified. The '||' operator is surrounded by opposite expressions '!optional' and 'optional'. dataComponent.cpp 283

In this case, we can simplify the (!optional || (optional && !can_skip_optional)) expression. The right-hand side of the operator || is only evaluated if optional is contextually converted to true. In this case, the left operand of the operator && will always be true, making the check redundant. Hence, developers could simplify this code and enhance its readability in the following way:

if (!this->hasComponent(depConstString.hash) 
&& (!optional || !can_skip_optional)) 
{
  ....
}

Fragment N9

// performance_profiling.h

static const nau::PerfTagFlag NAU_PERFTAGS = ....;

The PVS-Studio warning: V1043 A global object variable 'NAU_PERFTAGS' is declared in the header. Multiple copies of it will be created in all translation units that include this header file. performance_profiling.h 23

Declaring constants in the header file may seem straightforward heterogeneous, but the devil lies in the details. In C++, objects declared as const within a namespace—including the global namespace—have internal linkage. This means that when the header file is included in multiple translation units, each unit gets its copy, potentially increasing the executable size.

To avoid this issue, you may consider using one of the following approaches.

Before C++17: Split the declaration and definition of constant. In the header file, declare the constant using the extern specifier, and define the actual constant in the implementation file.

// performance_profiling.h
extern const nau::PerfTagFlag NAU_PERFTAGS;  // Declaration

// performance_profiling.cpp
const nau::PerfTagFlag NAU_PERFTAGS = ....;  // Definition

Starting with C++17: Declare the constant in the header file using the inline specifier. This ensures that only one constant object exists across the entire program.

// performance_profiling.h
inline static const nau::PerfTagFlag NAU_PERFTAGS = ...;

Fragment N10

std::string Paths::getAssetsPath() const
{
  if (m_paths.find("assets") == m_paths.end())
  {
      return "";
  }

  return m_paths.find("assets")->second;
}

The PVS-Studio warning: V838 Temporary object is constructed during the call of the 'find' function. Consider using an ordered associative container with efficient lookup to avoid construction of temporary objects. file_system.cpp 421

Let's take a look at how the Paths::m_paths data member is declared:

class SHARED_API Paths
{
  ....
private:
  ....
  std::map<std::string, std::string> m_paths;
};

The Paths::m_paths data member is a sorted associative container. Its member function, std::map::find, expects an argument of the same type as the key. This means that the string literal will be converted into std::string, potentially entailing dynamic memory allocations.

Such a search is called homogeneous, when the parameter type of the search functions matches the key type within the container. Since C++14, heterogeneous search has been introduced for sorted associative containers. This means that the parameter type may not match the key type within the container. This can boost performance by eliminating the need for an additional conversion.

To enable heterogeneous search, developers could pass std::less<> specialization as the third template argument to the std::map:

class SHARED_API Paths
{
  ....
private:
  ....
  std::map<std::string, std::string, std::less<>> m_paths;
};

In addition to this optimization, we may also notice that the object by the "assets" key is searched twice in the container, which looks very suboptimal. Here's the possible fix:

std::string Paths::getAssetsPath() const
{
  auto it = m_paths.find("assets");
  if (it == m_paths.end()) return {};
  return it->second;
}

Fragment N11

class DeferredRT
{
  ....
protected:
  ....
  StereoMode m_stereoMode = StereoMode::MonoOrMultipass;
  int m_numRt = 0, m_width = 0, m_height = 0;
  nau::string m_name;

  d3d::SamplerHandle m_defaultSampler;
  ResizableResPtrTex m_mrts[MAX_NUM_MRT] = {};
  ResizableResPtrTex m_depth;

  bool m_useResolvedDepth = false;
};

DeferredRT::DeferredRT(const char* name,
                       int w, int h,
                       StereoMode stereoMode,
                       unsigned msaaFlag,
                       int numRT,
                       const unsigned texFmt[MAX_NUM_MRT],
                       uint32_t depthFmt)
  : m_stereoMode(stereoMode)
{
  m_name = name;
  ....
}

The PVS-Studio warning: V818 It is more efficient to use an initialization list 'm_name(name)' rather than an assignment operator. deferredRT.cpp 117

In the DeferredRT class, the m_name data member is initialized within the constructor body. However, this approach isn't as efficient as it could be. When the control flow enters the constructor body, the member initializer list has already been executed. If the data member is unspecified in it, it'll be initialized by default. For a string initialized by default, the following typically happens:

Subsequently, the operator = is called in the constructor body, which effectively discards the result of the default initialization. To fix it, just initialize the data member in the member initializer list of the constructor:

DeferredRT::DeferredRT(const char* name,
                       int w, int h,
                       StereoMode stereoMode,
                       unsigned msaaFlag,
                       int numRT,
                       const unsigned texFmt[MAX_NUM_MRT],
                       uint32_t depthFmt)
  : m_stereoMode(stereoMode)
  , m_name(name)
{
  ....
}

Here are other cases:

  • V818 It is more efficient to use an initialization list 'm_stream(stream)' rather than an assignment operator. nanim_asset_container.cpp 29
  • V818 It is more efficient to use an initialization list 'm_stream(stream)' rather than an assignment operator. material_asset_container.cpp 51
  • V818 It is more efficient to use an initialization list 'm_stream(shaderPackStream)' rather than an assignment operator. shader_asset_container.cpp 60
  • V818 It is more efficient to use an initialization list 'c(p)' rather than an assignment operator. dag_bounds3.h 209

Wrap-up

We've explored the main approaches that developers can use to enhance Nau Engine performance. Identifying and addressing these issues is a crucial step toward building efficient, stable, and engaging games. In the next article, we'll talk about common errors in class design.

Thank you for reading :)

Posts: articles

Poll:

Subscribe
and get the e-book
for free!

book terrible tips
Popular related articles


Comments (0)

Next comments next comments
close comment form
close form

Fill out the form in 2 simple steps below:

Your contact information:

Step 1
Congratulations! This is your promo code!

Desired license type:

Step 2
Team license
Enterprise license
** By clicking this button you agree to our Privacy Policy statement
close form
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
* By clicking this button you agree to our Privacy Policy statement

close form
Free PVS‑Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

close form
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

close form
I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

close form
check circle
Message submitted.

Your message has been sent. We will email you at


If you do not see the email in your inbox, please check if it is filtered to one of the following folders:

  • Promotion
  • Updates
  • Spam