The best-known coin to implement CryptoNight is Monero (XMR), though the algorithm was originally created for Bytecoin (BCN).
Like the Ethash algorithm, the main goal of CryptoNight is ASIC resistance, though it also aims to keep CPUs competitive by being inefficient to run on GPUs.
The algorithm’s performance is extremely sensitive to memory latency, because it includes a loop in which memory writes and subsequent reads occur repeatedly. Each access address depends on the result of the previous one, so the accesses cannot be overlapped or prefetched and every one pays the full memory latency. The result of this memory-intensive work then determines which hash function is used in a later step to produce the candidate block solution.
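The loop and the data-dependent choice of final hash can be illustrated with a toy sketch. This is not CryptoNight itself: the mixing arithmetic, the iteration count, the function name toy_memory_hard_hash and the four hashlib functions standing in for the real finalisation hashes are all assumptions chosen only to show the shape of the technique.

```python
import hashlib

SCRATCHPAD_BYTES = 2 * 1024 * 1024  # CryptoNight's scratchpad is 2 MiB
SLOT = 16                           # bytes read/written per access (toy choice)
MASK = (1 << 128) - 1

# Stand-ins for the real finalisation hashes; each returns a 32-byte digest.
FINAL_HASHES = (
    hashlib.blake2s,
    hashlib.sha256,
    hashlib.sha3_256,
    lambda b: hashlib.blake2b(b, digest_size=32),
)


def toy_memory_hard_hash(data: bytes,
                         scratchpad_bytes: int = SCRATCHPAD_BYTES,
                         iterations: int = 1 << 16) -> bytes:
    """Toy illustration of a latency-bound read/write loop (not real CryptoNight)."""
    slots = scratchpad_bytes // SLOT
    # Fill the scratchpad with pseudo-random bytes derived from the input.
    pad = bytearray(hashlib.shake_256(data).digest(scratchpad_bytes))
    state = int.from_bytes(hashlib.sha3_256(data).digest()[:SLOT], "little")

    for _ in range(iterations):
        # The address depends on the current state, so it cannot be predicted
        # (or prefetched) before the previous iteration has finished.
        idx = (state % slots) * SLOT
        value = int.from_bytes(pad[idx:idx + SLOT], "little")
        # Write back a value that depends on both the state and what was read.
        pad[idx:idx + SLOT] = (value ^ state).to_bytes(SLOT, "little")
        # The next address depends on the value just read.
        state = (state * 0x9E3779B97F4A7C15 + value + 1) & MASK

    # The outcome of the memory-hard phase picks the final hash function.
    selector = state & 3
    return FINAL_HASHES[selector](bytes(pad)).digest()
```

Because every iteration must wait for the previous read to complete, the running time of such a loop is governed by how quickly a single random access returns, not by how many accesses the hardware could issue in parallel.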
Another design choice in the algorithm was to size the working data (a 2 MB scratchpad) to match the shared cache memory available per core in a modern CPU. Such cache has far lower latency than normal system DRAM or a GPU’s VRAM, so a CPU sees a significant efficiency advantage over a GPU when running CryptoNight.
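One practical consequence of this sizing can be sketched as simple arithmetic: each mining thread needs its own scratchpad, so the number of threads that can stay cache-resident is bounded by the shared cache size divided by 2 MiB. The helper below is a hypothetical illustration of that rule of thumb (the function name and parameters are assumptions, not part of any miner’s API).

```python
SCRATCHPAD_BYTES = 2 * 1024 * 1024  # one CryptoNight scratchpad per thread


def threads_that_fit_in_cache(shared_cache_bytes: int, hardware_threads: int) -> int:
    """Estimate how many mining threads can keep their scratchpad in cache.

    A rough rule of thumb only: it ignores cache used by the OS and by
    other processes.
    """
    by_cache = shared_cache_bytes // SCRATCHPAD_BYTES
    # Never suggest more threads than the CPU can actually run.
    return min(by_cache, hardware_threads)
```

Under these assumptions, a CPU with 16 MiB of shared cache and 16 hardware threads would be limited to 8 cache-resident threads; any further threads would spill their scratchpads into slower DRAM.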
In the case of the Monero project, the developers have also committed to introducing replacement variants of the CryptoNight algorithm with new versions of the blockchain protocol. Each variant changes the algorithm slightly so as to scupper the efforts of ASIC designers, since an ASIC’s logic is fixed at manufacture and cannot be reprogrammed.
It is rare to see PCs built with multiple CPUs for mining Monero, due to the specialised, high-cost nature of such builds. More commonly, dedicated Monero mining PCs are GPU-based, akin to Ethereum mining PCs, and these also benefit from similar memory clock frequency increases.