28 Replies. 2 pages. Viewing page 1.
Newer [  1  2  ] Older
28.
 
Re: Morning Mobilization
Mar 13, 2017, 20:52
 
https://community.amd.com/community/gaming/blog/2017/03/13/amd-ryzen-community-update

"AMD believes that the Windows® 10 thread scheduler is operating properly for 'Zen,' and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture." (as posted on Blues today)
27.
 
Re: Saturday Tech Bits
Mar 12, 2017, 23:39
 
Simon Says wrote on Mar 12, 2017, 18:18:
Heck, even Allyn over at PCPer now admits their testing, results and conclusions were flawed as there is enough "spillover" to affect performance and Windows 10 doesn't treat them correctly as NUMA cores:

https://ibb.co/b9uAYv

If it treated them as NUMA cores, you'd effectively have a 4-core CPU in most games. Even with the 147ns spillover latency across the CCX interconnect, that is still faster than having 4 fewer cores in a properly multithreaded game.

TL;DR.. wait for Zen2

26.
 
Re: Saturday Tech Bits
Mar 12, 2017, 21:40
 

I doubt a fix will be in the next Patch Tuesday, because I bet they've got some more "thinking" to do.
"I expect death to be nothingness and by removing from me all possible fears of death, I am thankful to atheism." Isaac Asimov
25.
 
Re: Saturday Tech Bits
Mar 12, 2017, 18:18
 
Heck, even Allyn over at PCPer now admits their testing, results and conclusions were flawed as there is enough "spillover" to affect performance and Windows 10 doesn't treat them correctly as NUMA cores:

https://ibb.co/b9uAYv
24.
 
Re: Morning Mobilization
Mar 12, 2017, 17:43
 
NewMaxx wrote on Mar 12, 2017, 14:51:
... and people expecting miracles are ignoring the realities of trade-offs.

"Miracles"

Still haven't heard a proper explanation anywhere in this article, nor anywhere else for this:

Windows 10 - 1080 Ultra DX11:

8C/16T - 49.39fps (Min), 72.36fps (Avg)
8C/8T - 57.16fps (Min), 72.46fps (Avg)

Windows 7 - 1080 Ultra DX11:

8C/16T - 62.33fps (Min), 78.18fps (Avg)
8C/8T - 62.00fps (Min), 73.22fps (Avg)

Source
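To put those numbers side by side, here is a minimal sketch that computes the relative differences. The FPS values are copied from the figures above; the helper names (`percent_gain`, `win7_vs_win10`, `smt_effect`) are made up for illustration.

```python
# FPS figures copied from the post above (1080 Ultra, DX11).
results = {
    ("Windows 10", "8C/16T"): {"min": 49.39, "avg": 72.36},
    ("Windows 10", "8C/8T"): {"min": 57.16, "avg": 72.46},
    ("Windows 7", "8C/16T"): {"min": 62.33, "avg": 78.18},
    ("Windows 7", "8C/8T"): {"min": 62.00, "avg": 73.22},
}


def percent_gain(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def win7_vs_win10(config, metric):
    """How much higher (or lower) the Windows 7 figure is than the Windows 10 one."""
    return percent_gain(
        results[("Windows 7", config)][metric],
        results[("Windows 10", config)][metric],
    )


def smt_effect(os_name, metric):
    """Change from 8C/8T (SMT off) to 8C/16T (SMT on) on one OS."""
    return percent_gain(
        results[(os_name, "8C/16T")][metric],
        results[(os_name, "8C/8T")][metric],
    )


for config in ("8C/16T", "8C/8T"):
    for metric in ("min", "avg"):
        print(f"Win7 vs Win10, {config}, {metric}: {win7_vs_win10(config, metric):+.1f}%")
for os_name in ("Windows 10", "Windows 7"):
    print(f"SMT on vs off, {os_name}, min: {smt_effect(os_name, 'min'):+.1f}%")
```

The interesting comparison is the sign of `smt_effect` on minimum FPS: it is negative on Windows 10 and slightly positive on Windows 7 with these figures.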

Or this ( Look at the CS:GO and Rise of the Tomb Raider videos ):

Ryzen Win 10 VS Win 7 comparison @ HardOCP

This has nothing to do with "miracles". And until I hear the explanation for this and/or win 10 ( on which ALL reviews have been conducted ) gets up to speed with win 7 ( on which Ryzen wasn't reviewed anywhere ), I won't jump hastily to conclusions by just writing it off as a tradeoff when it clearly isn't according to real world data from reputable enough sources.

Refusing to run anything on Windows 7 as a comparison on the same system... I find that to be a very questionable way of running an objective analysis on this subject.

This comment was edited on Mar 12, 2017, 18:27.
23.
 
Re: Morning Mobilization
Mar 12, 2017, 14:51
 
Scottish Martial Arts wrote on Mar 12, 2017, 09:02:
Having now read the article in full, the CCX and Infinity Fabric architecture do indeed appear to cause the Ryzen to behave as though a Ryzen chip is two 4-core CPUs on a single die. While that may sound like a distinction without a difference, it's actually quite important, and the benchmarks are clearly showing substantially more overhead for a context shift across the Infinity Fabric as opposed to a context shift within a CCX.

tl;dr Operating Systems leverage shared L3 caching to minimize the overhead of swapping processes and threads across multicore CPUs. Since Ryzen partitions its two CCXs, no such optimization is possible for context switches between CCXs. Windows can be patched to minimize cross CCX context switches, but when such a switch is unavoidable then there's nothing Windows can do but incur the extra overhead baked into Ryzen's architecture.

I state how it is configured in my first reply below, but in order to elaborate a bit to support your reply here I'll add more. Nothing you say is wrong - I'm just going to distill it down a bit.

First, the L3 cache in Ryzen is an overfill ("victim") cache, which means anything that doesn't fit into L1/L2 ends up in L3. Second, the interconnect between the two banks of L3 cache (two CCX modules) is rather slow and also has to handle other tasks (e.g., PCI-e). Therefore, if information is in the other bank relative to a CPU requiring that information (assuming the L3 is full), you take a high latency hit. The design is inherently more similar to a dual-CPU. It's possible to optimize for this, but it's a bit like "optimizing" for the last 512MB of the GTX 970. Ryzen's design is far more consistent, of course, but the alternatives would be mirroring the cache (a la SLI/Crossfire), which would be devastating for an 8-core CPU, or more intelligently distributing the "spill", which, again, would be more like a 2-CPU situation, but without the dual RAM banks. AMD simply made compromises to bring cost down on this chip, and people expecting miracles are ignoring the realities of trade-offs.
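A toy model of the "victim" behaviour described above, for anyone who wants to see it in motion. This is only a sketch: the class name `VictimHierarchy`, the LRU replacement policy, and the tiny cache sizes are assumptions for illustration, not details of the actual Ryzen design.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy two-level cache: lines evicted from L2 spill into a victim L3."""

    def __init__(self, l2_lines, l3_lines):
        self.l2 = OrderedDict()  # oldest entry first (LRU order, assumed policy)
        self.l3 = OrderedDict()
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines

    def _insert_l2(self, addr):
        self.l2[addr] = True
        if len(self.l2) > self.l2_lines:
            # L2 is full: its least recently used line becomes a "victim"
            # and is written into L3 instead of being dropped.
            victim, _ = self.l2.popitem(last=False)
            self.l3[victim] = True
            if len(self.l3) > self.l3_lines:
                self.l3.popitem(last=False)  # L3 full too: line is lost

    def access(self, addr):
        if addr in self.l2:
            self.l2.move_to_end(addr)
            return "L2 hit"
        if addr in self.l3:
            # Promote the line back to L2; it no longer lives in L3.
            del self.l3[addr]
            self._insert_l2(addr)
            return "L3 hit"
        self._insert_l2(addr)  # fetched from memory straight into L2
        return "miss"


cache = VictimHierarchy(l2_lines=2, l3_lines=4)
for addr in [1, 2, 3, 1, 2]:
    print(addr, cache.access(addr))
```

The point of the model is that L3 only ever holds what L2 pushed out, so whichever CCX did the evicting decides which bank a line ends up in.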

This comment was edited on Mar 12, 2017, 15:08.
22.
 
Re: Saturday Tech Bits
Mar 12, 2017, 14:33
 
El Pit wrote on Mar 12, 2017, 05:18:
Interesting article about Ryzen. So, the patch won't make it really competitive when it comes to gaming? It is not just Microsoft's fault? No patch will make Ryzen great again?

Just kidding. Ryzen offers quite some performance for the asked price, but for those with deeper pockets, intel is still the first choice.

its all about the best value.. intel hasnt had nearly enough price cuts in recent years.. now they are back at it and thats whats up
21.
 
Re: Morning Mobilization
Mar 12, 2017, 14:32
 
Scottish Martial Arts wrote on Mar 12, 2017, 09:02:
Agent-Zero wrote on Mar 12, 2017, 00:39:
seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that

Having now read the article in full, the CCX and Infinity Fabric architecture do indeed appear to cause the Ryzen to behave as though a Ryzen chip is two 4-core CPUs on a single die. While that may sound like a distinction without a difference, it's actually quite important, and the benchmarks are clearly showing substantially more overhead for a context shift across the Infinity Fabric as opposed to a context shift within a CCX.

thanks for the info - thats really interesting
20.
 
Re: Morning Mobilization
Mar 12, 2017, 13:40
 
RedEye9 wrote on Mar 12, 2017, 11:19:
VaranDragon wrote on Mar 12, 2017, 10:55:
Scottish Martial Arts wrote on Mar 12, 2017, 10:49:
HoSpanky wrote on Mar 12, 2017, 10:21:
I'm not really in the market for a new processor yet, but currently it's still looking like Intel's gonna get my money.

Yeah, for gaming, the i7 7700k is definitely the best bang for buck right now. If you're a video editor, or software engineer, or anyone who does work that benefits from parallelization, Ryzen becomes a lot more compelling: if you want to stick with Intel and go beyond four cores, then you're starting to enter server CPU land, with all the attendant price increases.

No one needs an i7 for gaming. An i5 7600K that runs at a stock 4.2 GHz turbo boost, and can probably hit 5 GHz with adequate cooling, is more than enough for any gaming rig.
More games spec a 4+ core CPU. If you're looking to get any future-proofing out of a new build, or you play the latest AAA releases, then an i7 should be on your short list.

PC games have really been slow on the uptake in making use of multithreaded CPUs. Both of these CPUs have 4 physical cores each, although the i7 7700k does feature multithreading which would make it faster in any applications that do use more than 4 cores. You are correct that it would be better for a futureproofed system. However in that case wouldn't it be better to go for a full 8 core processor?

I still think that waiting for software (games in our case) to catch up and make full use of the available hardware is preferable to any kind of "future proofing", since by definition the future is kind of hard to predict, and Intel now definitely has some competition again in that segment. (It's still the king of single threaded speed, which is why you see it beat Ryzen in almost any gaming benchmark)
19.
 
Re: Morning Mobilization
Mar 12, 2017, 11:19
 
VaranDragon wrote on Mar 12, 2017, 10:55:
Scottish Martial Arts wrote on Mar 12, 2017, 10:49:
HoSpanky wrote on Mar 12, 2017, 10:21:
I'm not really in the market for a new processor yet, but currently it's still looking like Intel's gonna get my money.

Yeah, for gaming, the i7 7700k is definitely the best bang for buck right now. If you're a video editor, or software engineer, or anyone who does work that benefits from parallelization, Ryzen becomes a lot more compelling: if you want to stick with Intel and go beyond four cores, then you're starting to enter server CPU land, with all the attendant price increases.

No one needs an i7 for gaming. An i5 7600K that runs at a stock 4.2 GHz turbo boost, and can probably hit 5 GHz with adequate cooling, is more than enough for any gaming rig.
More games spec a 4+ core CPU. If you're looking to get any future-proofing out of a new build, or you play the latest AAA releases, then an i7 should be on your short list.
"I expect death to be nothingness and by removing from me all possible fears of death, I am thankful to atheism." Isaac Asimov
18.
 
Re: Morning Mobilization
Mar 12, 2017, 10:55
 
Scottish Martial Arts wrote on Mar 12, 2017, 10:49:
HoSpanky wrote on Mar 12, 2017, 10:21:
I'm not really in the market for a new processor yet, but currently it's still looking like Intel's gonna get my money.

Yeah, for gaming, the i7 7700k is definitely the best bang for buck right now. If you're a video editor, or software engineer, or anyone who does work that benefits from parallelization, Ryzen becomes a lot more compelling: if you want to stick with Intel and go beyond four cores, then you're starting to enter server CPU land, with all the attendant price increases.

No one needs an i7 for gaming. An i5 7600K that runs at a stock 4.2 GHz turbo boost, and can probably hit 5 GHz with adequate cooling, is more than enough for any gaming rig.
17.
 
Re: Morning Mobilization
Mar 12, 2017, 10:49
 
HoSpanky wrote on Mar 12, 2017, 10:21:
I'm not really in the market for a new processor yet, but currently it's still looking like Intel's gonna get my money.

Yeah, for gaming, the i7 7700k is definitely the best bang for buck right now. If you're a video editor, or software engineer, or anyone who does work that benefits from parallelization, Ryzen becomes a lot more compelling: if you want to stick with Intel and go beyond four cores, then you're starting to enter server CPU land, with all the attendant price increases.
16.
 
Re: Morning Mobilization
Mar 12, 2017, 10:21
 
Let's not forget that most games still aren't particularly multi-thread friendly. The games that I see hitting performance walls on my machine are doing so with a single CPU core maxed out. Throwing more cores at that won't solve anything. Oh sure, game devs *should* code their games to be more multi-core-friendly, but good luck getting already-completed games rebuilt.
When benchmarks came out, Ryzen's single core performance versus Intel was pretty sad, and no scheduler bug is responsible for that. I'm not really in the market for a new processor yet, but currently it's still looking like Intel's gonna get my money.
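The "more cores won't help a single maxed-out core" point is essentially Amdahl's law. A minimal sketch, assuming a game frame can be split into a serial part and a perfectly parallel part (the 50% figure in the usage loop is purely illustrative, not a measurement of any game):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative only: a workload where half the frame time is serial.
for cores in (1, 2, 4, 8, 16):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
```

With half the work stuck on one thread, the speedup can never reach 2x no matter how many cores are added, which is why single-core speed still dominates in that kind of game.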
15.
 
Re: Morning Mobilization
Mar 12, 2017, 09:02
 
Agent-Zero wrote on Mar 12, 2017, 00:39:
seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that

Having now read the article in full, the CCX and Infinity Fabric architecture do indeed appear to cause the Ryzen to behave as though a Ryzen chip is two 4-core CPUs on a single die. While that may sound like a distinction without a difference, it's actually quite important, and the benchmarks are clearly showing substantially more overhead for a context shift across the Infinity Fabric as opposed to a context shift within a CCX.

As you probably already know, a process is a program that has been loaded into memory and is in a runnable state. More importantly, a process has associated with it a virtual memory space, i.e. each memory address the program references is not the location in physical memory but rather a mapping to the actual physical address. Additionally, the process maintains a series of data structures containing the state of the process, i.e. the contents of the various CPU registers at a given moment of execution, among other things, allowing it to be stopped and started with the process itself none the wiser. However, if a process needs to communicate with another process, the virtualized address space means that they are effectively partitioned from one another, and while there are several different ways to handle inter-process communication, they basically all involve one or more processes blocking (waiting) on I/O calls, which slows down processing dramatically. It would be far faster if you could just share a virtualized address space, and that's what threads do: they are mini-processes that maintain their separate execution states but share their virtualized address space allowing rapid in-memory communication without resorting to high-overhead I/O calls or context switches (switching which process is actively running on a core). The downside of threads is that the shared address space makes it extremely easy to write code where the results depend on the order in which the CPU schedules threads (race condition) which leads to all sorts of subtle and hard-to-detect bugs.
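To make the shared-address-space point concrete, here is a small sketch of threads updating one shared counter, with and without a lock. The function name `run_counter` and the thread/iteration counts are arbitrary choices for illustration. Whether the unlocked version actually loses updates on a given run depends on the interpreter and on timing, which is exactly what makes this class of bug hard to detect.

```python
import threading


def run_counter(num_threads, increments, use_lock):
    """Have several threads increment one shared counter; return the final value."""
    counter = {"value": 0}  # lives in the address space shared by all threads
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            if use_lock:
                with lock:  # only one thread at a time may read-modify-write
                    counter["value"] += 1
            else:
                # Unprotected read-modify-write: another thread may run
                # between the read and the write, and its update is lost.
                current = counter["value"]
                counter["value"] = current + 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


expected = 4 * 20000
print("expected:", expected)
print("with lock:", run_counter(4, 20000, use_lock=True))
print("without lock:", run_counter(4, 20000, use_lock=False))
```

The locked run always reaches the expected total; the unlocked run can come up short, and by a different amount each time.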

What this means for our discussion is that it's possible to have parallel execution, or at least a facsimile of it, with even a single core CPU. The OS lets one process/thread run either until it hits a specific time limit (quantum) or blocks on I/O, and then the OS saves the state of the process and swaps in another one to let it run. Since CPUs execute literally billions of instructions per second, this all happens so fast that to us slow humans, it appears like the computer is doing multiple things at once. Add more cores, and you really can run more processes/threads at once. Importantly, with a HyperThreaded (Intel's term) CPU, each physical core keeps the execution state of two threads at once and shares its execution units between them, so it can fill slots that one thread leaves idle with work from the other. That is not the same as having two full cores, but it is why the CPU presents itself to the OS as having x physical cores, each composed of y (usually two) virtual cores.
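The quantum-based switching described above can be sketched as a simple round-robin loop. This is a toy model under stated assumptions: work is measured in abstract "units", there is no I/O blocking or priority, and `round_robin` is a hypothetical helper, not how any real kernel is written.

```python
from collections import deque


def round_robin(tasks, quantum):
    """Simulate one core time-slicing between tasks.

    tasks maps a task name to its remaining work units.
    Returns the list of (name, units_run) slices in execution order.
    """
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)  # run until done or the quantum expires
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished: save its state and put it at the back of the queue.
            queue.append((name, remaining - ran))
    return timeline


for name, ran in round_robin({"A": 5, "B": 2}, quantum=2):
    print(name, ran)
```

Each entry in the timeline is one context switch; shrink the quantum and the tasks appear to run "at the same time" even though only one is ever on the core.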

The hypothesis was that the Ryzen wasn't correctly enumerating which of its virtual cores were part of the same physical core to Windows, and therefore the NT kernel wasn't scheduling threads to the right virtual core in order to leverage the AMD equivalent of HyperThreading. But the benchmarks show that isn't the case: Windows understands which virtual cores belong to which physical core, it's just that the Ryzen architecture has a heavy context shift penalty across the two CCXs. A patch to Windows will allow it to optimize for avoiding context shifts across the CCXs, but that still won't change the fact that any context shift across CCXs will be much slower than a context shift within a CCX.

Why does this happen? Without knowing the specifics of the Ryzen architecture, it has to come down to caching. Memory varies in its speed, its volatility, and its price. Accordingly, on the CPU die itself you have several levels of expensive to manufacture but extremely fast memory that loses its state on power down (volatile); then you have main memory, which isn't as fast and still loses its state on power down, but is much cheaper than on-die cache memory; and finally you have persistent storage in the form of SSDs and HDDs, which are very slow to access relative to the other memory, but make up for it by being very inexpensive.

The trick for an operating system is to place the code that is most likely to be needed, and needed repeatedly, in the fastest levels of cache, so as to optimize the CPU's instruction retrieval speed (it can't execute what's still in transit over the various buses). Aside from repeatedly executed procedures and repeatedly retrieved data, the OS is probably also going to put some or all of its process bookkeeping data in some level of on-die cache, so it can minimize the time necessary to context switch between processes. Again, without knowing the specifics of the Ryzen architecture, if the two CCXs are maintaining separate, likely L3, caches, then any context switch across CCXs loses the context switch optimization which the shared cache within the CCX provides.

So where does this leave us? A patch to Windows can make the NT kernel's scheduling algorithm aware of the Ryzen context switch penalty across CCXs, and allow it to be smarter, but the penalty will still be baked into the Ryzen architecture no matter what Microsoft, or any OS developer does.
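What "being smarter" could look like, as a toy policy. This is emphatically not the NT kernel's algorithm; it is a sketch that assumes 8 cores numbered 0-7 with cores 0-3 on one CCX and 4-7 on the other, and `pick_core` is a hypothetical helper.

```python
CORES_PER_CCX = 4  # assumption: cores 0-3 form CCX 0, cores 4-7 form CCX 1


def ccx_of(core):
    """Which CCX a core belongs to under the numbering assumed above."""
    return core // CORES_PER_CCX


def pick_core(last_core, idle_cores):
    """Choose where to resume a thread, preferring to keep its cache warm.

    Order of preference: the core it last ran on, then any idle core on the
    same CCX (shared L3), and only then a core on the other CCX.
    """
    if not idle_cores:
        return None  # nothing free; the thread has to wait
    if last_core in idle_cores:
        return last_core
    same_ccx = sorted(c for c in idle_cores if ccx_of(c) == ccx_of(last_core))
    if same_ccx:
        return same_ccx[0]
    return min(idle_cores)  # unavoidable cross-CCX move: pay the penalty


print(pick_core(1, {1, 5}))
print(pick_core(1, {3, 5}))
print(pick_core(1, {5, 6}))
```

The last case is the one no patch can remove: when only the other CCX has a free core, the thread either waits or eats the cross-CCX cost.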

tl;dr Operating Systems leverage shared L3 caching to minimize the overhead of swapping processes and threads across multicore CPUs. Since Ryzen partitions its two CCXs, no such optimization is possible for context switches between CCXs. Windows can be patched to minimize cross CCX context switches, but when such a switch is unavoidable then there's nothing Windows can do but incur the extra overhead baked into Ryzen's architecture.
14.
 
Re: Morning Mobilization
Mar 12, 2017, 08:05
 
Agent-Zero wrote on Mar 12, 2017, 00:39:
this quote from the article:

In this way at least, the CCX design of 8-core Ryzen CPUs appears to more closely emulate a 2-socket system.

seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that

Yes, it acts like a dual socket CPU system, in that it has similar problems.

To put it very simply think of those CPUs as a motherboard in a sense, with two CPUs on it (each CPU having multiple cores, 4 core each in this case).

Everything is working fine until something from CPU B basically needs data stored in CPU A. Getting that data from one to the other takes time, and during that time the process can't do its work so it's just waiting. This could have a kind of knock on effect as during that time something on CPU A could also be waiting for results from CPU B, that data/results would then need to be moved back for processing on CPU A which will have to wait for it to get there.

Treating it as a NUMA system means treating it as a multi-socket system: the scheduler will try to avoid placing processes that share or rely on the same data on different "CPUs".
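A quick way to see why placement matters is to count how many data-sharing pairs end up split across the two "CPUs". This is only a sketch: the thread names, the sharing pairs, and the 4-cores-per-node numbering are all hypothetical.

```python
def cross_node_pairs(placement, sharing_pairs, cores_per_node=4):
    """Count pairs of threads that share data but sit on different nodes.

    placement maps a thread name to the core it runs on; cores are assumed
    to be numbered so that core // cores_per_node gives the node (CCX).
    """
    return sum(
        1
        for a, b in sharing_pairs
        if placement[a] // cores_per_node != placement[b] // cores_per_node
    )


# Hypothetical game threads: render talks to physics, audio talks to net.
sharing = [("render", "physics"), ("audio", "net")]

naive = {"render": 0, "physics": 4, "audio": 1, "net": 5}
numa_aware = {"render": 0, "physics": 1, "audio": 4, "net": 5}

print("naive placement:", cross_node_pairs(naive, sharing))
print("NUMA-aware placement:", cross_node_pairs(numa_aware, sharing))
```

Both placements use the same four cores; the only difference is that the second one keeps each sharing pair on one node, so neither pair has to reach across the interconnect.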

If that's the only problem Microsoft should have no trouble addressing it but the improvements aren't going to be big.



It's a sensible way for AMD to go. It helps keep the chips small and cheap, since less of the "CPU" (die) is dedicated to letting the different "cores" communicate directly (that becomes increasingly complex, big, and costly the more cores you add if each new one is to be directly linked to every other one). Keeping them small also means making better use of a wafer, and defects are less costly.

13.
 
Re: Saturday Tech Bits
Mar 12, 2017, 06:47
 
El Pit wrote on Mar 12, 2017, 05:18:
Interesting article about Ryzen. So, the patch won't make it really competitive when it comes to gaming? It is not just Microsoft's fault? No patch will make Ryzen great again?

Just kidding. Ryzen offers quite some performance for the asked price, but for those with deeper pockets, intel is still the first choice.

Yes, the patch will make a difference, don't expect a quantum leap though.

No, it's nobody's "fault". It was AMD's decision to make Ryzen a 2 x CCX design. Windows 10 only needs to be aware of this design; currently it isn't.

In the best case for Ryzen the OS is aware of its 2 x CCX design AND the code is optimized for such a design (certain pro applications are but I doubt the same can be said about games).
12.
 
Re: Saturday Tech Bits
Mar 12, 2017, 05:18
El Pit
 
 
Interesting article about Ryzen. So, the patch won't make it really competitive when it comes to gaming? It is not just Microsoft's fault? No patch will make Ryzen great again?

Just kidding. Ryzen offers quite some performance for the asked price, but for those with deeper pockets, intel is still the first choice.
"There is no right life in the wrong one." (Theodor W. Adorno, philosopher)
"Only a Sith deals in absolutes." (Obi-Wan Kenobi, Jedi)
Founder, president, and only member of the official "Grumpy Old Gamers Club". Please do not apply.
11.
 
Re: Morning Mobilization
Mar 12, 2017, 01:57
 
Agent-Zero wrote on Mar 12, 2017, 00:39:
Scottish Martial Arts wrote on Mar 12, 2017, 00:09:
Agent-Zero wrote on Mar 11, 2017, 17:38:
Im not a Windows or OS programmer, so I have no idea how true this may be - but I would have suspected the lack of performance of new AMD chips in terms of 1080p gaming and other gaming related tasks would be more likely due to the way games themselves are coded...

It's an interplay between how parallelizable the computation is, i.e. how much work can profitably be done simultaneously, how much of the computation is actually parallelized and how well that parallelization is implemented, i.e. how the game/application is programmed, and how the operating system spreads the parallelized computation, e.g. processes and threads, across the multiple cores, and how the OS schedules those processes and threads that have been tasked to a given core.

this quote from the article:

In this way at least, the CCX design of 8-core Ryzen CPUs appears to more closely emulate a 2-socket system.

seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that

Perhaps to support multithreading? 2 threads per core.
If Russia stops fighting, the war ends. If Ukraine stops fighting, Ukraine ends. Slava Ukraini!
10.
 
Re: Morning Mobilization
Mar 12, 2017, 01:56
 
Agent-Zero wrote on Mar 12, 2017, 00:39:
seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that

The R5/R7 chips use two banks of L3 cache instead of a shared one like, for example, Intel's Broadwell-E. So, yes, it is like a dual-CPU (with each 3/4-core CCX having one bank). Also, CPUs in the past - like the Core 2 Quad - had a similar configuration with L2 cache. It was often said this kept the Q6600, for example, from being a true quad core (and more like a dual C2D).

This comment was edited on Mar 16, 2017, 19:21.
9.
 
Re: Morning Mobilization
Mar 12, 2017, 00:39
 
Scottish Martial Arts wrote on Mar 12, 2017, 00:09:
Agent-Zero wrote on Mar 11, 2017, 17:38:
Im not a Windows or OS programmer, so I have no idea how true this may be - but I would have suspected the lack of performance of new AMD chips in terms of 1080p gaming and other gaming related tasks would be more likely due to the way games themselves are coded...

It's an interplay between how parallelizable the computation is, i.e. how much work can profitably be done simultaneously, how much of the computation is actually parallelized and how well that parallelization is implemented, i.e. how the game/application is programmed, and how the operating system spreads the parallelized computation, e.g. processes and threads, across the multiple cores, and how the OS schedules those processes and threads that have been tasked to a given core.

this quote from the article:

In this way at least, the CCX design of 8-core Ryzen CPUs appears to more closely emulate a 2-socket system.

seems to imply that a single Ryzen processor appears to act like a dual CPU system, is that correct? thats bizarre, because I figured that multi-core CPUs were already acting like that to some degree, but i suppose it must be different in terms of how the system buses work with that