AStar - RecastGraph, RichAI - Performance?

Hi Aron,

I’m using RecastGraph and on the agent side RichAI/AIDestinationSetter/RVOController. I’m currently optimizing my game and it seems that AStar is currently using most of the CPU on the main thread, by far.

The numbers below are way higher than in reality because I was deep profiling, however they are consistent with what I usually see: AStar (or rather the RichAI class) seems to eat a lot of time on the main thread.

Question 1
The levels are completely static. From the moment an agent becomes active until its death, nothing in the level can change. The agents spawn in one spot and navigate to one other spot. The only thing that can happen is that sometimes they change their mind and navigate to the player (who is obviously moving around the place). If that happens, I set the player’s location as their target over and over. That wasn’t happening when I took the measurements below, though.

What I’m asking is: With the level / the graph itself not changing (no navmesh cuts either), what can I do to stop the agents from working more than they have to? I tried setting “Recalculate paths automatically” to “never”. But that doesn’t just affect recalculation, it means the agents simply don’t bother calculating a path at all, not even when I give them a new target.

Is there anything I can do to only have them work out a new path when I change the destination? I’m happy working that out myself. For example, set “Recalculate paths automatically” to “never” and then manually nudge them to create a new path after I’ve changed their destination. Is this possible? And if so, can that still work with local avoidance/RVOController?

Question 2
How come RichAI uses this much time on the Main Thread, when the Thread Count on the AStar component is set to “Automatic High Load” (which uses 24 threads on the machine in question)?

[Deep profile]

[Agent configuration]

Right, I’m answering my own post :)

I’ve now changed the following things:

  • “Recalculate paths automatically” is set to “never”
  • Whenever I set a new destination on the AIDestinationSetter, I call SearchPath() on the RichAI
  • If I set the destination to a “moving” target (like the player), I keep calling SearchPath() 3x per second
  • If I set the destination to a static target, the agent only calls SearchPath() 1x (when I set the destination)

Calling SearchPath() on the RichAI has to happen 1 frame after setting the destination of the AIDestinationSetter, otherwise it simply won’t do anything.

    private IEnumerator RichAISearchPath()
    {
        yield return new WaitForEndOfFrame();
        // This needs to happen 1 frame after _aiDestinationSetter.target has been changed
        _richAI.SearchPath();
    }

This has reduced the CPU usage that SearchPath() generates to almost 0.
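Putting the points above together, a minimal sketch of the whole setup could look like this. It assumes “Recalculate paths automatically” is set to “never” in the inspector. The class name `ManualRepath`, the method `SetTarget` and the field `_movingTargetInterval` are made up for illustration and are not part of the A* Pathfinding Project.

```csharp
using System.Collections;
using Pathfinding;
using UnityEngine;

// Hypothetical helper; the class and member names are illustrative only.
[RequireComponent(typeof(RichAI), typeof(AIDestinationSetter))]
public class ManualRepath : MonoBehaviour
{
    // Interval between searches while chasing a moving target (about 3x per second).
    [SerializeField] private float _movingTargetInterval = 1f / 3f;

    private RichAI _richAI;
    private AIDestinationSetter _aiDestinationSetter;
    private Coroutine _repathRoutine;

    private void Awake()
    {
        _richAI = GetComponent<RichAI>();
        _aiDestinationSetter = GetComponent<AIDestinationSetter>();
    }

    // Call this instead of assigning AIDestinationSetter.target directly.
    public void SetTarget(Transform target, bool targetMoves)
    {
        _aiDestinationSetter.target = target;

        // Cancel any repath loop that belongs to the previous target.
        if (_repathRoutine != null) StopCoroutine(_repathRoutine);
        _repathRoutine = StartCoroutine(Repath(targetMoves));
    }

    private IEnumerator Repath(bool targetMoves)
    {
        // Delay the first search, since calling SearchPath() right after
        // changing the target did nothing in my tests.
        yield return new WaitForEndOfFrame();
        _richAI.SearchPath();

        // Static target: the single search above is enough.
        // Moving target: keep searching at the configured interval.
        while (targetMoves)
        {
            yield return new WaitForSeconds(_movingTargetInterval);
            _richAI.SearchPath();
        }

        _repathRoutine = null;
    }
}
```

Usage would be something like `GetComponent<ManualRepath>().SetTarget(player.transform, true)` for the player and `SetTarget(exitPoint, false)` for a fixed spot, where `player` and `exitPoint` are placeholders for whatever references the game already holds.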

There is one side effect: In a test scene, the agents have to traverse a huge open area from their spawn point to their end point. Randomly, some agents show erratic behaviour towards the end. Every few seconds, they turn around, backtrack for a second or two, and then continue along their path.
I can’t observe this behaviour if I set recalculate to “every 0.5s”. I can observe it, though, if I set recalculate to “dynamically, at least every 4s”, so it’s not a side effect of the path being calculated only once.

This behaviour can be observed in about 10-15% of all agents, completely randomly. I’ve not been able to reproduce it in “real life” conditions, that is in my game’s levels where the geometry is more complex, so it might have to do with the agents traversing a big flat area with no obstacles.

So this seems to work, which leaves me still with the same questions though:

  • Is this a good idea if the level doesn’t change? It seems to work, and the difference in CPU load is gigantic.
  • Why does AStar hog that much CPU time on the main thread if pathfinding is supposed to be threaded?

By the way, pathfinding is still (by a long shot) the most expensive thing on my CPU’s main thread. Deep profiling again, BatchedEvents.FixedUpdate() takes between 3 and 8ms (out of a total time of 25-35ms per frame).

I assume there’s not much I can do here, since this seems to be the logic for actually moving the agents:

Hi

I’m not sure why your search settings would have improved performance in this case. Though I do note that in your first profiling screenshot you had 288 agents, but in the last one you only had 54.
Everything you see in the profiler is from the movement of the agents. The pathfinding itself happens in separate threads (this is what the multithreading option controls). The movement scripts themselves are not multithreaded, though.

However, I am working on new burst-enabled movement scripts which are multithreaded and significantly faster.

For now, though, I see that RichFunnel.FindWalls takes a lot of time. You can reduce the impact of this by setting RichAI → wall force and wall dist to zero.

Thanks for the reply. If the pathfinding runs in a separate thread, then I’m also confused as to why cutting the number of SearchPath() calls reduces the load on the main thread so much. I’m looking forward to the new version though. Is there any way I could be notified once the multithreaded movement scripts are available? Pathfinding is a very sensitive core part of my game so I only update AStar when I have a very good reason (in other words I don’t just update to each new version and automatically get the benefits).

The number of agents in the first and second screenshots is different but not quite as much as you point out.

In the first screenshot, one frame took 113 ms. In those 113ms, FixedUpdate was called 6 times (check the second line), which means the 288 calls further below is 48 agents being moved 6 times each.

In the second screenshot, rendering a frame took 33ms, which included 2 FixedUpdates, which in turn means we had 27 agents resulting in 54 calls.

So, comparing just these two snapshots of profiler data (which weren’t cherry picked to show the best/worst behaviour), we get this anecdotal data:

Total Frame Time:
Old method (SearchPath every 0.5s): 113ms/48 agents = 2.354ms per agent
New method (SearchPath once): 33ms/27 agents = 1.222ms per agent

Time spent on BatchedEvents.DoEvents() for 1 frame:
Old method (SearchPath every 0.5s): 67.44ms/48 agents = 1.405ms per agent
New method (SearchPath once): 6.89ms/27 agents = 0.255ms per agent

I’ll check out what happens if I set Wall Force/Dist to 0, thank you for the pointer.

For what it’s worth, setting both Wall Force/Dist to 0 results in BatchedEvents.DoEvents() taking around 0.1ms per agent in the above scenario. At a first glance, I can’t observe any negative side effects in the behaviour of the agents.
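The same change can presumably be made from code instead of the inspector. This is only a sketch: it assumes the installed version of RichAI exposes the two inspector settings as public fields named `wallForce` and `wallDist`, and the class name `DisableWallAvoidance` is made up for illustration.

```csharp
using Pathfinding;
using UnityEngine;

// Hypothetical helper: zeroes the RichAI wall settings when the agent is created.
[RequireComponent(typeof(RichAI))]
public class DisableWallAvoidance : MonoBehaviour
{
    private void Awake()
    {
        var richAI = GetComponent<RichAI>();

        // Assumed field names; check them against the RichAI source in your version.
        richAI.wallForce = 0f;
        richAI.wallDist = 0f;
    }
}
```

Setting the values on the agent prefab in the inspector has the same effect and avoids the extra component.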
