You wrote that performance improves when you disable hardware acceleration (decoding) on the client. That suggests the hardware is to blame rather than NoMachine.
You could try using H.264 software encoding by following the instructions here: https://www.nomachine.com/AR10K00706. This should help reduce CPU usage.
In its current design, NoMachine sends the content of all the remote monitors even if you are viewing only one of the displays on the client side. This ensures a smoother experience when you switch between monitors.
This behaviour is becoming less convenient now that monitor resolutions are getting larger. We are already working on improving how NoMachine handles encoding for multi-monitor setups, so that each monitor is encoded separately and its content is transmitted separately to the client.
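To give a rough sense of why this matters, here is a back-of-the-envelope sketch comparing the pixels per frame for all monitors against the one monitor being viewed. The monitor sizes are made up for illustration, and pixel count is only a crude proxy for encoding cost, which also depends on the codec and on how much of the screen is changing.

```python
# Hypothetical monitor layout (width, height); not taken from any real setup.
monitors = [(3840, 2160), (3840, 2160), (1920, 1080)]

def pixels(width, height):
    """Number of pixels in one frame for a monitor of this size."""
    return width * height

# All remote monitors encoded and sent together.
total = sum(pixels(w, h) for w, h in monitors)

# Only the monitor currently being viewed on the client.
viewed = pixels(*monitors[0])

print(f"all monitors: {total:,} pixels per frame")
print(f"viewed only:  {viewed:,} pixels per frame")
print(f"ratio:        {total / viewed:.2f}x")
```

With these example sizes, the full set of monitors comes to 2.25 times the pixels of the single 4K display being viewed.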
Another way to further reduce CPU usage is to disable “client side image post-processing”. This option applies a filter to improve the quality of the image, but consumes more CPU. This is one feature which I am sure other remote access products don’t have.