> All the cores or processors share the same memory (RAM) and IO bus.
Not always so.
For example, all Intel x86-based servers with more than one CPU socket are configured so that each CPU (socket) has its own local RAM and I/O. This is called NUMA, non-uniform memory access.
Assuming all memory is local can very quickly saturate the link between CPU sockets, even to the point where that fancy 48-core multi-socket system performs slower than a single-socket system with 4 cores.
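On Linux you can observe and control this locality with `numactl`. A sketch (where `./myapp` is a stand-in for any memory-bound program, not a real binary):

```shell
# Inspect the NUMA topology: nodes, their CPUs, and per-node memory sizes.
numactl --hardware

# Best case: run on node 0 and allocate on node 0, so accesses stay local.
numactl --cpunodebind=0 --membind=0 ./myapp

# Worst case, for illustration: run on node 0 but allocate on node 1,
# forcing every memory access across the inter-socket link.
numactl --cpunodebind=0 --membind=1 ./myapp
```

Comparing the runtimes of the last two invocations gives a rough measure of the remote-access penalty on a given box.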
> Applications use multiple threads mainly to: ... increase responsiveness to external events such as: ... messages from other systems
Multithreading often decreases performance in that case, because thread context switching is far from free.
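A minimal Python sketch of that overhead (the message count and `handle` function are illustrative): both versions compute the same result, but the threaded one pays for creating, scheduling, and joining a thread per message, which for cheap messages can dwarf the work itself.

```python
import threading

N = 1000  # number of small "messages"; illustrative value

def handle(msg):
    return msg * 2  # stand-in for cheap per-message work

# Inline handling: no extra threads, no extra context switches.
inline_results = [handle(m) for m in range(N)]

# One thread per message: same work, plus per-thread creation,
# scheduling, and join overhead.
threaded_results = [None] * N

def worker(i):
    threaded_results[i] = handle(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert inline_results == threaded_results
```

Timing both loops (e.g. with `time.perf_counter`) on a typical machine shows the threaded version losing badly for work this small; threads only pay off when each message carries enough work (or blocking I/O) to amortize the overhead.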
"the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors)"
Don't confuse "Uniform" and "Unified". Those are different words.
The sentence you quoted [and try to disprove] is actually correct. Multiple cores and multiple sockets share access to the RAM. If two cores want to access it at the same time, one will have to wait.
Each socket and each core has internal caches, so they don't have to wait for each other [as long as they operate on distinct memory regions]. Hence memory access is "Non-Uniform": the time an access takes depends on where the data is currently located.
(If we really want to put "unified" in a sentence somewhere, the memory architecture of x86 is unified. The design is well defined and all system components have the same constraints.)
No comment on the content, still making my way through it, but there's a ligature/glyph rendering issue with all lowercase "if" sequences which is distracting. If the author is reading the comments, maybe they can fix it.