I don't know much about chip fabrication but reading wikipedia I see this:
> The term "3 nanometer" has no relation to any actual physical feature (such as gate length, metal pitch or gate pitch) of the transistors. ... a 3 nm node is expected to have a contacted gate pitch of 48 nanometers and a tightest metal pitch of 24 nanometers... "3 nm" is used primarily as a marketing term by individual microchip manufacturers ... there is no industry-wide agreement among different manufacturers about what numbers would define a 3 nm node
So can someone who is more familiar with it help me understand what progress we're making in terms of the physical aspect of the chips? Is transistor density still increasing? Are we using different materials? At a physical level what's different about this generation of chips?
For the first version of TSMC 3nm, die sizes are projected to be ~42% smaller than TSMC 5nm and you have a choice of using ~30% less power or improving performance by ~15%.
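For anyone who wants to sanity-check those projections, here's the napkin math (a sketch based only on the numbers above, not official TSMC figures):

```python
# Back out the density gain implied by a die that is ~42% smaller
# at (roughly) the same transistor count.
area_ratio = 1 - 0.42          # the new die is ~58% of the old area
density_gain = 1 / area_ratio  # relative transistors per mm^2

print(f"Implied density gain over N5: ~{density_gain:.2f}x")  # ~1.72x
```

That lands in the same ballpark as the ~60% logic-density gain quoted further down the thread.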
While it's unlikely scaling continues that far, I got this funny idea: eventually the marketing gate size might become smaller than a single atom!
We're close enough to see that on the horizon; about 2 angstroms is the width of a silicon atom. Or well, distance between two nuclei in a silicon lattice. 'Width' of an atom is a tricky concept with many definitions.
Hence AMD using different process nodes for the stuff that scales and the stuff that doesn't at the same time with their chiplets, so they can get the best of both worlds.
Didn't AMD start down the chiplet path way before the horrible SRAM scaling became clear? I was under the impression they were chasing better yield, and this is just another handy gain.
3D V-Cache in the only SKUs on the market right now (5800X3D and Milan-X) is 7nm, just like the CCD beneath it. IOD is Glofo 12nm on that.
On GPUs this strategy isn't working out too well for them right now. Their 7900 is a lot more silicon with a lot more transistors (300 mm² of N5, 220 mm² of N6, 58bn xtors) compared to the competition (380 mm² of N4, 46bn) [1] and is marginally faster in some cases, drastically slower in others, and uses much more power in every case.
[1] While AMD uses two slightly older processes, AIUI the overall packaging and silicon costs are speculated to be quite a bit higher for AMD compared to nVidia here.
I don't think summing the chiplet areas is a good way to compare, because smaller nodes cost more and have lower yield, and smaller individual chiplets mean far less loss per wafer: the bigger a single usable unit, the higher the chance something on it is broken and the whole unit is unusable. Losing one or two small chiplets is a lot cheaper than losing entire chips in a monolithic setup.
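To make the yield argument concrete, here's a rough sketch using the common first-order Poisson yield approximation; the defect density and die sizes are invented illustrative numbers, not real process data:

```python
import math

# Hypothetical defect density, for illustration only (defects per cm^2).
DEFECT_DENSITY = 0.1

def die_yield(area_mm2: float) -> float:
    """Fraction of dies with zero defects, assuming Poisson-distributed defects."""
    area_cm2 = area_mm2 / 100
    return math.exp(-DEFECT_DENSITY * area_cm2)

monolithic = die_yield(600)  # one big 600 mm^2 die
chiplet = die_yield(75)      # one of eight 75 mm^2 chiplets

print(f"600 mm^2 monolithic die yield: {monolithic:.1%}")  # ~54.9%
print(f"75 mm^2 chiplet yield:         {chiplet:.1%}")     # ~92.8%
# A defective chiplet scraps 75 mm^2 of silicon; a defective monolithic die scraps 600 mm^2.
```

The exact numbers don't matter; the point is that yield falls off exponentially with die area, which is exactly the effect chiplets sidestep.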
AMD claim that producing a 16-core Ryzen CPU without chiplets would have cost them twice as much. You say that the GPU strategy isn't working out, but I think that really depends on what their pricing strategy is. I suspect they just don't feel they can take the market right now, and are making a lot of margin while the market will support it, rather than trying to compete on price at this moment, but I admittedly don't have any evidence for that.
I guess we'll see, but it sure seems like there is a lot of advantage to building the way they have to me, I'd personally be surprised if it didn't pay off.
Yes, transistor densities are increasing, interconnects are getting denser as well (that's what the "metal pitch" is about). There are also improvements in power consumption.
I imagine there is more use of EUV and surely material compositions are getting tweaks as well.
This is pretty much as it has been for the last couple of decades, though at a somewhat slower rate and with an increasing number of caveats. For example, the density of SRAM has recently been decreasing more slowly than the density of logic. What's particularly tricky are the economics: the cost per transistor isn't really going down much anymore.
Transistors at that scale are 3D rather than 2D, so the expected scaling ratios no longer apply. If you wanted double the density of a 2D transistor with 5nm features, you'd need to scale down to 3.5nm, but if you wanted to double the volumetric density in 3D, you'd only need to go down to 4nm.
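The arithmetic behind those numbers, as a quick sketch:

```python
# To double areal (2D) density you divide the linear feature size by sqrt(2);
# to double volumetric (3D) density you only divide by the cube root of 2.
feature_nm = 5.0

shrink_2d = feature_nm / 2 ** (1 / 2)
shrink_3d = feature_nm / 2 ** (1 / 3)

print(f"2x areal density:      {shrink_2d:.2f} nm")  # ~3.54 nm
print(f"2x volumetric density: {shrink_3d:.2f} nm")  # ~3.97 nm
```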
How do you make a meaningful comparison to past generations with that being the case? You could then just describe it in 2D terms, and refer to your 4nm process as 3nm. Then all of the same rules of thumb still hold.
I assume this is responsible for at least some portion of the deviation and confusion, though maybe not all of it. Happy to be corrected if I'm wrong.
The industry roadmap has adopted a new nomenclature to try to clear this up, but it remains to be seen if the major companies will adopt it.
In any case, people who jump to saying these process node names are bullshit because they don't simply map to something like a wire width are kinda naively missing the point. Everyone in the industry understands what's going on and these fab node names emerged naturally as a shorthand way of describing changes to design rules that are a lot more complex and varied. That's still true today.
Also, in practice, no one cares how "tall" the chip is — 100nm vs 200nm is ... unimportant. The result is that the volume of the poly hasn't changed much in the last ~10 years. Transistors are still biggish; just with tiny footprints. (And really, it's pitch density: we can pack the poly tighter.)
Not quite. We basically don't care about 3D density at the moment. The plane of the wafer is ultimately what matters from a density perspective. Right now we're actively cutting into 3D density heavily for small 2D density gains.
I assume you care about 2D vs 3D for different variables, no? I agree 3D, at least while H<<W, doesn't matter for clock speeds, but wouldn't it still play a significant role in the power characteristics, particularly things like gate capacitance? Genuinely curious. Going off of my memory of my Physics of CompE class ~10 years ago at this point.
If the gates of the transistor are taking up all that space, but FinFETs and newer gate all around FETs have significantly sized voids in 3D space, you just can't practically use those voids for anything else without cutting into yields or having awful leakage issues.
Smaller marketing numbers generally do relate to improved performance, usually from minor optimizations across the board like needing to move less charge to switch a transistor, etc. Also certain numbers introduce things like FinFET or gate-all-around that further increase performance.
We are now past the point where a simple size shrink will yield significant benefit, due to physical limits no less. Most of the heavy lifting in terms of performance improvement comes from microarchitecture advancements.
Look at FinFET transistors; this is the style of MOSFET currently employed at the smallest scales that I know of. This is what started pushing the quoted feature size of the process node smaller and smaller past 24nm: at least for recent processes, the smallest parts of the "fin" were getting down into the 10-7nm range, and smaller... but this isn't the size of the whole transistor, which needs multiple fins to make up the gate structure. So this is how we went from talking about actual MOSFET feature size to the single-digit process node sizes that are essentially marketing terms now, versus actual gate size.
There are actual density increases due to the packing of these FinFETs vs. more traditional MOSFETs, and thus the efficiency gains that have been happening, among other reasons.
Not only is the density still improving, but new processes come with better characteristics for dealing with current leakage, smaller voltage requirements, and faster charge carrier propagation.
In other words, the transistors are being redesigned so that they work better.
Things are just not moving at the rate they used to, and costs are going up instead of down. But there is plenty of movement.
Moore’s law is over. There are a bunch of competing ways to measure advancements. What matters in the end is design and performance for a specific task.
If 3nm is merely a marketing term that has no relation to the distance between transistors, then aren't we approaching a state where we could theoretically go into the negatives?
TL;DR: It's exponential, so it'll never go negative. After 3 comes 2. After 2 (at least for Intel) comes 1.8, or 18 Angstroms. See Wikipedia[0] for a reference.
Longer answer: (someone correct me where I'm wrong)
Historically, the number was a measure of the size of the transistor. 250 nm process nodes were producing planar transistors that were about 250 nm across. A node was then defined as the measure that brought a set increase in performance. In other words, going from 250 nm to 180 nm was the same improvement as going from 180 nm to 130 nm.
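That cadence works out to roughly a 0.7x linear shrink per node, which halves the area per transistor (0.7^2 ≈ 0.5). A quick sketch of how the familiar node names fall out of that rule:

```python
# Each full node step is approximately a 0.7x linear shrink.
node_nm = 250.0
for _ in range(8):
    print(f"{node_nm:.0f} nm")
    node_nm *= 0.7
# Prints roughly 250, 175, 122, 86, 60, 42, 29, 21 -- close to the familiar
# 250/180/130/90/65/45/32/22 nm ladder of node names.
```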
Around the 45/32 nm mark, we reached the limits of making planar transistors. The machines couldn't get enough resolution to make them reliably. So FinFET and others were invented as a way to increase performance to what was needed for the next generations. Essentially, the product has the performance that planar transistors at said size would give, but they're not actually that size.
However, now that a few node generations have passed, even that's not true anymore. Now, it's just a marketing term, and you can't even compare performance between different manufacturers. TSMC's 5 nm has different performance than Samsung's and Intel's. All you can know is that TSMC's 3 nm will be better than their 5 nm.
The Intel 8088 chip was made on a 3 micrometer process node and a decade later they had to measure in nanometers for the 486 which launched on an 800 nm process node.
In terms of how much processing a single CPU core can accomplish (single thread performance), it plateaued about 15 years ago. So if your computation must be single threaded (audio effects processing, audio codecs, Photoshop, any app where computation has to finish before the next computation can begin, some real-time apps), then there has been little performance increase since then. However, if your computation can use multiple cores (video, graphics, user interfaces), the CPU power is approximately multiplied by the number of cores. More cores is like using more computers, so power use tends to increase, but since the task can be broken into multiple independent parts, overall it gets done faster (see the sketch below).

Number of transistors is no longer a good measure of overall computational capability, since often added transistors are not powered up most of the time (they are only used for special tasks like cryptography), but even so the number of transistors has also plateaued in the graph. Note the vertical scale is logarithmic, with each increment a factor of 10 over the line below it, so you need a LOT more transistors to keep that number-of-transistors curve going up the way it did in the past.
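The gap between "single-threaded work sees little gain" and "parallel work scales with core count" is usually framed with Amdahl's law; here's a minimal sketch (my illustration with made-up parallel fractions, not the parent's numbers):

```python
# Amdahl's law: speedup is limited by the serial fraction of the workload.
def speedup(cores: int, parallel_fraction: float) -> float:
    serial = 1 - parallel_fraction
    return 1 / (serial + parallel_fraction / cores)

for p in (0.0, 0.5, 0.95):
    print(f"parallel fraction {p:.0%}: "
          f"8 cores -> {speedup(8, p):.1f}x, 64 cores -> {speedup(64, p):.1f}x")
# A fully serial task gains nothing from extra cores; a 95%-parallel task gets
# ~5.9x on 8 cores and ~15.4x on 64 -- well short of "one x per core".
```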
New technologies such as optical computation and quantum computing might help create even faster single threaded processors, but so far they have had no effect on consumer devices.

A lot of the performance limits on CPUs are related to how fast you can feed them data (amount of RAM, RAM bandwidth, long term storage speed, bus bandwidths). Such bus speeds have also been improving over time, but they still keep CPUs from running at top speed. We do not yet have widespread use of consumer systems where all the apps and data are stored in long term stable RAM rather than on an SSD (though some server apps do run from RAM only), so it is not just the CPU single threaded performance that matters. If apps and data start residing in RAM rather than on disk drives, then CPUs can run faster, but they'll use a lot more power as well, so cooling may become a bigger problem even if the bus systems can be made faster.

Even if computation becomes really fast, you generally need to then move the data to and from somewhere else for it to be useful (onto an SSD, a flash drive, network storage), so the speed of that transfer also limits how much computation can be done. So even if your processor could do a color transform on an entire video in a few microseconds, it may take an hour to transfer the video to your tablet, which means it takes an hour to free up space to do the next computation. Only applications which do a lot of math computation without much data I/O (like computing the Mandelbrot set) can really make full use of fast CPUs today, so increasing your I/O speed will usually have a bigger impact than increasing your CPU speed.
What is atomic scale to you? A silicon atom is about 0.2nm across, and a team at Tsinghua University, China, has built a transistor gate with a length of 0.34 nm.[1] The difference between these lengths, about 0.14 nm, is only about 23 picometers larger than the covalent radius of a silicon atom at roughly 0.117 nm.
The mind boggles at what is being accomplished in semiconductor production at this scale. The smallest features (fin width, e.g.) in the 3nm process are roughly 5nm across. That's the diameter of a hemoglobin molecule. Or, to put it another way, it's about 40 silicon atoms. All this enables single chip processors with over 100 billion transistors, so, more transistors on a laptop processor than neurons in your brain. Wholesale, that may run close to one ten-millionth of a cent per transistor (napkin math below).
All this, and yet the process for producing transistors in a planar layout on silicon (that is, an integrated circuit) was invented after I was born.
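As a quick check on that cost-per-transistor figure (the $100 wholesale chip price here is my own assumption, purely to show the arithmetic):

```python
# Hypothetical wholesale cost for a 100-billion-transistor chip.
chip_cost_dollars = 100
transistors = 100e9

cents_per_transistor = chip_cost_dollars * 100 / transistors
print(f"{cents_per_transistor:.0e} cents per transistor")  # 1e-07: one ten-millionth of a cent
```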
Way more complex to be sure. The "more powerful" part requires a bit of interpretation. One transistor certainly can't do the work of one neuron, even given forever to do it. But a not-too-large-to-count collection of transistors can do the work of a single neuron, and do it much faster. I don't know that 100 billion transistors can out-compute 80 billion neurons with 100 trillion synapses, but it's not implausible to me. Neurons have a firing rate on the order of 10Hz; an entire CPU can be clocked at 100 million times that rate. It's certainly not obvious that those eight or so orders of magnitude can't make up for the functional limitations of transistors. Obviously standard microprocessor architectures are not comparable to the architecture of the brain, so a true head-to-head comparison is impossible, but it's not absurd to imagine we've achieved or surpassed parity.
These things are approximate but Hans Moravec made some reasonable estimates back in 1997 that human brain equivalence was about 100 teraflops https://jetpress.org/volume1/moravec.htm
It gets even more complicated when it is hypothesized neurons use spike timings to "compute".
> A potentially more powerful way to construct symbols with spikes is to use spike timing, i.e., to consider exactly when spikes occur within the time window. The maximum number of symbols a neuron can construct with spike timing is limited only by the temporal resolution of the code. In formal terms, the information capacity of spike timing is much higher than the information capacity of spike count.
You probably want to qualify "single chip processors" with "small".
Single chip processors with over 100 billion transistors have been possible for years. We've had trillion-transistor single chip processors since 2019[1].
It's been a great run for the past few decades getting improved performance for "free" (i.e. through extremely advanced CPU engineering), and I'm glad they're still able to eke out another generation of hardware improvement. Curious how hitting limits/diminishing returns in silicon will force innovation in architecture and software. Will be amazing to see the pendulum swing back toward software optimization. So much performance is being wasted or left on the table.
N3 is actually much worse than expected, it’s essentially a marketing node to satisfy contractual obligations and the expected gains from N3 won’t actually be reached until N3E in another year+.
This means that Samsung’s 3GAP/2GAE (both with GAAFET) will match their density and release timeline…
Even TSMC is hitting delays and diminishing returns, and costs are spiraling faster than density - N3 alone is ~40% more expensive than N5, both in wafer and validation/tapeout costs, for only ~30% higher density in real-world products. And remember that’s the fake N3 now and the real N3E will undoubtedly be even more expensive.
Rumors are that all 10 of TSMC’s top-10 customers have cut their orders next year and they have 50% less utilization than expected, which is really bad for the node economics. So TSMC may finally have found the point at which costs have spiraled past consumers’ willingness to pay.
I.e., N7 was much more expensive when it first launched compared to now? And TSMC prices its wafers based on supply and demand, so eventually N7 will become very cheap.
Also, it's not a surprise that the top 10 TSMC customers have ordered less. It probably has way more to do with a massive inventory of unsold phones and computers, chip glut, and lower projected consumer demand. N3 pricing wouldn't make an impact in 2023.
> N3 is actually much worse than expected
> ~30% higher density in real-world products
I'm a software guy so excuse my ignorance, but isn't 30% higher density a pretty big deal? Why is N3 worse than expected? Were expectations sky high and expectations weren't hit, or is 30% density increase not as big of a deal as it sounds to me.
I think a lot of people may not understand how density increases relate to performance improvements.
It's the combination of density and cost increase that's the problem. I don't have the actual N3 numbers, but taking the original example, if you get 30% increased transistor density, but your cost per area goes up 40%, then as a customer you're not in a great position - you're still paying more per transistor.
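Putting those two example numbers together (same illustrative figures as above, not actual N3 pricing):

```python
# Density up 30%, cost per area up 40%: what happens to cost per transistor?
density_gain = 1.30  # transistors per mm^2, relative to the old node
cost_gain = 1.40     # dollars per mm^2, relative to the old node

cost_per_transistor = cost_gain / density_gain
print(f"Relative cost per transistor: {cost_per_transistor:.2f}x")  # ~1.08x
# Roughly 8% *more* per transistor on the new node, before counting the extra
# design and validation spend.
```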
While there are still other benefits to gain from a new node and increased density (despite the cost increase), if your cost per transistor goes up, it limits where you might want to use the new node (particularly in value sensitive parts of the market).
There's been a long-term trend towards this point - the cost of a new node (the blend of developing it, implementing a design in it, and cost per transistor) has been spiralling up for like a decade+. These are the same pressures that have caused the consolidation around Intel/TSMC/Samsung(ish) at the bleeding edge.
This is something I think will become a major thing in the next 5 years. Broadly speaking, from an OS level, the folks that optimize their software the best will see big benefits. I have advocated (lightly so) to some Linux folks that we should start moving toward where the situation will be by the end of the decade. Optimize now, before we have to.
The pessimist (AKA Stallman) in me is thinking that vendors like Apple and Microsoft are going to use things like their Security processors to force people to buy new hardware. Software optimizations can potentially keep older hardware going for a lot longer, but by having an arbitrary security chip requirement - folks can be forced to purchase new hardware every few years as older model support is dropped. We are already seeing this in Windows 11.
While we will not have a year of the Linux desktop, the ability for the speed optimizations to be more open to all people could be a sizable win for the platform. Provided the ability to do this isn't locked down.
there was a HN post yesterday about replacing Redux with ChatGPT, so I would have to agree maybe we can all aspire to write slightly leaner software with the massive performance laptops tend to have these days!
You mean moving memory closer to compute. And that's happening already in lots of places, even though it brings increased complexity due to having to account for NUMA.
I am thinking long term, as neural nets scale up, we'll have to move compute into memory. The other way around is problematic, small caches don't work well with neural nets. In fact the fastest Transformer (FlashTransformer) is based on principled usage of the SRAM cache because that's the bottleneck.
That's not easy at all because larger caches are slower. A higher hit rate with worse latency can give worse performance. We may see more levels of cache to better optimize the tradeoffs.
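The "higher hit rate but worse latency can still lose" point falls out of the usual average-memory-access-time formula; a sketch with made-up latencies and hit rates, not real chips:

```python
# AMAT = hit_time + miss_rate * miss_penalty (all in cycles here).
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    return hit_time + miss_rate * miss_penalty

MISS_PENALTY = 100  # cycles to go out to the next level / DRAM

small_fast = amat(3, 0.10, MISS_PENALTY)     # 13.0 cycles
bigger_wins = amat(5, 0.07, MISS_PENALTY)    # 12.0 cycles: extra hits pay for the latency
bigger_loses = amat(5, 0.095, MISS_PENALTY)  # 14.5 cycles: they don't

print(small_fast, bigger_wins, bigger_loses)
```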
> 3nm technology would feature an estimated 60 percent gain in density of logic transistors and reduce power consumption by 30 percent to 35 percent at the same rate compared with 5nm technology
I'm really looking forward to the new Gravitons and the M3 Pro/Max/Ultra MacBooks, and what they will be able to achieve!
Smaller transistors do indeed reduce dynamic power consumption, as transistors can switch with less charge; however, the leakage current (static power consumption) will rise on top of already significant amounts. I'm not sure how much actual energy savings will come from this in an application like laptops.
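For reference, the dynamic part of that is usually approximated as P ≈ α·C·V²·f; a sketch with invented scaling factors, not published process figures:

```python
# First-order dynamic power model: P_dyn ~ activity * C * V^2 * f.
def relative_dynamic_power(c_scale: float, v_scale: float, f_scale: float) -> float:
    return c_scale * v_scale ** 2 * f_scale

# Hypothetical: 10% less switched capacitance, 5% lower voltage, same clock.
same_clock = relative_dynamic_power(c_scale=0.90, v_scale=0.95, f_scale=1.0)
print(f"Dynamic power at the same clock: {same_clock:.2f}x")  # ~0.81x
# Voltage enters squared, which is why even small V reductions matter -- but this
# says nothing about the leakage (static) term the parent is worried about.
```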
Also, increased density isn't a huge deal: while it reduces the die size, which reduces the cost, the newer process also likely costs more than an already mature one. It can help with increasing the number of logic gates in series before needing to be pipelined into the next clock cycle, but that's not that big of a deal either.
Shrinking the logic will still allow for lots more SRAM on a chip of the same size, even if SRAM itself won't scale. More SRAM is almost always a cheat code for squeezing more performance out of a design.
Logic and SRAM are completely different things in this context. SRAM actually has had almost zero scaling since N7 while logic has more than doubled in density (2x theoretical / 1.5-1.6x real-world shrink at N5 relative to N7, for example, and then N3 gives about 30% real-world shrinks to logic, although that’s less than expected and the good shrinks are going to come from the later N3E node).
As a result we will probably see the pendulum swing back from cache-heavy designs. Going ham on cache was an obviously advantageous strategy at 7nm; you can basically look at it as N7 having been two full node families ahead of the curve on SRAM density (Samsung is only catching up at 3nm/2nm). But since SRAM hasn't scaled at all in the last 2 nodes, it's becoming comparatively more effective to spend your transistors on logic instead. You still want big caches of course (and cache can be easily stacked à la AMD's V-Cache), but it's more worthwhile to spend more heavily on logic than it was on 7nm.
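A quick sketch of why the pendulum swings (the 50/50 logic/SRAM split below is an invented example, not any real floorplan):

```python
# Port a die that is half logic, half SRAM to a node where logic density
# doubles but SRAM barely scales.
logic_area, sram_area = 0.5, 0.5  # fractions of the old die

new_area = logic_area * (1 / 2.0) + sram_area * (1 / 1.0)
print(f"New die area: {new_area:.2f}x of the old one")            # 0.75x
print(f"Effective whole-chip density gain: {1 / new_area:.2f}x")  # ~1.33x, not 2x
# The more of the die you spend on cache, the less the new node buys you.
```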
I doubt this will happen. Cache is way too important. I would be very surprised to see chips with less cache in the future. I guess we will see in 10 years where the industry has moved. Currently a LOT of the transistor budget is blown on things we don't really need.
Yeah, I don't think we'll see reductions in cache, that's always a hard thing because software comes to expect it and can be hurt by reductions, it's kind of a one-way ratchet. But we'll stop seeing the huge increases gen-over-gen, at least on the primary die. Additional transistors spent will probably mostly be spent on logic rather than cache.
N7 basically was the era of "let's throw cache on everything, even products that traditionally haven't had caches". GPUs never had L3 cache before, for example, but that became advantageous, even in GPUs, which are focused around logic/computation rather than deep cache structures. And now you are seeing that train grind to a halt - RDNA3 did not expand the cache further, although they did increase the bandwidth of the cache (2.7x higher, although bear in mind that RDNA3's memory subsystem is 50% wider which means cache bandwidth is effectively 1.8x higher in a relative sense).
Similarly Intel went completely nuts once they finally got to a 7nm-tier node (Intel 7 aka 10ESF). Raptor Lake in particular is just caches all the way down...
Since cache no longer shrinks at 5nm and 3nm, but logic does, it makes sense to do some logic-intensive things rather than just throwing all your area at cache like 7nm.
On the flip side though, since cache can be pulled out to a separate cache die fairly effectively (AMD V-Cache), you can continue to scale cache there. RDNA4 and NVIDIA Blackwell are both rumored to be coming fairly quickly, which suggests a potential respin. And both AMD and NVIDIA have things they need to work on: NVIDIA doesn't have DP2.0 and AMD seems to have screwed up RDNA3 fairly badly, so a "similar but improved" quick refresh makes sense. Rumors mentioned "[NVIDIA] Ada Lovelace with a stacked cache die" at one point, and imo that is very plausible for the next-gen Blackwell chips as a potential quick-fix improvement to keep scaling Ada.
But I think we're going to see an overall trend towards "the logic die is for logic and L3 cache gets stacked on top" for now. That seems to be a formula that works without too much trouble. L1 and L2 on the logic die is unavoidable, there is too much incentive for proximity/latency improvements, but big stacked L3s seem very effective and doesn't cause MCM-style problems.
I don't think there's anything AMD-exclusive about cache stacking. It's made possible with TSV (through-silicon via) and direct bonding (ultrasonic or Cu-Cu bonding) to make the movement energy very low. You just bond the second die into some via you've left in your main die, it's relatively straightforward, just an evolution of the same HBM/HBM2 principles but with cache instead.
Going forward there will also be non-cache (and non-memory!) things bonded as well.
Mostly SoC related features. Look at the M1 for example. The budget allocated to a DP controller is HUGE. That doesn't have to be on the SoC chip and can be externalized.
I just don't know what is left to achieve really. I have an M1 pro and I just don't really hit any bottlenecks in my workflow as is (unless I am building some obscenely large legacy codebase)
It just seems to me that we are hitting a point of diminishing returns in terms of CPU performance because honestly, the speed of my laptop could triple and it would not noticeably affect my experience in any way.
The main areas of improvement that I would actually notice are better battery life, and faster RAM and SSDs (faster networking as well)
I am a YouTuber, and I spend a significant amount of time editing and rendering video. My main laptop is a full-spec M1 Max MacBook Pro, and when I'm home, I work on a full-spec M1 Ultra Mac Studio.
Both computers are extraordinarily fast, but I still spend a lot of time waiting.
I would be willing to spend a lot of money:
(1) to reduce that time,
(2) to significantly increase my laptop's battery life, and/or
(3) to significantly increase the size of my laptop's already-rather-gargantuan 8TB SSD.
Maybe I should become a programmer. Sounds like there's less waiting :P
This is more GPU than CPU, but I want to infer 3D models from my security cameras in real time so I can do some CSI "turn left and look behind it" shit. And use the overlapping textures for superresolution so I can shout "enhance!" and read the license plate reflected in the perp's eyeball.
As for reading e-mails and so on, yeah, we've pretty much reached peak e-mail.
I haven't described anything that can't already be done (well, reading a license plate in an eyeball was mostly an exaggeration). It just can't be done affordably in real time on a home computer. And it's just the first few out of many examples to come to mind.
I think people fall into the trap of conflating "this is what I do with my computer" with "this is what my computer is for." Obviously if computers are only for doing the things you can already do with them, then they won't benefit much from improvements.
> I just don't know what is left to achieve really.
Just use your imagination a little bit.
Unless you think your current workflow and the tasks you use your machines for are the pinnacle of what an individual will ever be able to accomplish?
Currently there are so many things that are so computationally intensive that they can only be processed on server farms that only the Googles and Amazons of the world can afford.
Their reserves are currently just under $50 billion, and they have bought back $130 billion+ in shares since 2018; I would guess most if not all of their reserves went into buybacks.
Apple chose to use N3, AMD chose to use N4, and Intel chose to use 7. I think it's fair to give Apple credit for making a tradeoff that leads to a better product.
A few questions for those familiar with chip production:
1. How much smaller can we make the silicon process before we run into physical limitations? How many years do you think that gives us?
2. What are the most promising alternatives to silicon? Carbon nanotubes, photons? How many years are we realistically away from these being commercially viable?
If you were to play Semi-God, where you could place each atom by hand, we still have at least 200x before we hit atomic feature size. But before you get excited about 200x, that is only about 10 full node steps, or roughly 20 years of progress (assuming the rate stays the same, which we all know won't happen). So let's say about ~30 years: 2050.
We have a very decent roadmap all the way till 2030. Till roughly around TSMC 1nm and 0.8nm. So 2nm in 2025, 1.4nm in 2027, 1nm in 2029. Intel seems to be executing well so far, so they might have a minor lead by 2025/2026. And may be able to reach 0.8nm by 2030.
We used to question whether there is enough market to sustain the development of leading nodes (which was the number one issue in the death of Moore's law, though the media doesn't mention it much, if at all). But given the current market size, with multiple trillion-dollar companies and geopolitical willingness to invest in silicon, I don't see us having a market / funding problem for the next 10 years.
So yes, you could get a GPU that performs close to an RTX 4090 by 2033, and it would cost you less than $200.
> we still have at least 200x before we hit atomic feature size
How so? The size of the silicon atom is 0.2nm, so we're already close to dealing with atomic scales. It seems likely we'll switch to an element like gallium, as a nearby comment mentioned, before we reach even 1nm node size.
At that point the CNT manufacturing process would have matured beyond the current experimental, but promising, stage.[1]
We can only speculate, but I reckon these changes will happen much earlier than 2050.
> So yes, you could get a GPU that performs close to an RTX 4090 by 2033, and it would cost you less than $200.
I sure hope we get to that point much sooner than that. :) NVIDIA and AMD are squeezing this performance at the expense of power, heat and size. Apple has proven it's possible to do this much more efficiently, but their silicon is still incredibly large compared to older generations. If we can get a better wafer yield with another material or process, while also reducing power and heat, then the only drawback for consumers could be cost, which should continue to go down. Though if NVIDIA has any say in the matter, they'll surely continue to price gouge consumers, as they've done this generation.
Because the current "5nm" tech doesn't even have any feature size that is a single digit nm. The number 200x is just some napkin maths, based on a 40nm feature size and a 0.2nm atom. So take it with a pinch of salt.
And before we even hit that, quantum tunnelling effect will happen sooner or later.
>Though if NVIDIA has any say in the matter, they'll surely continue to price gouge consumers, as they've done this generation.
It depends on how you view it. This is the first time Nvidia has introduced a GPU on a leading-edge node. People often like to compare it to previous x090 pricing, but Nvidia has always used a mature node before, and those are far cheaper in both design and wafer price. So the 4090, using 4nm, isn't really price gouging consumers the way most mainstream comments and media like to think it is.
I think we're getting pretty close to Silicon's limits, optimistically maybe 2-3 more node shrinks. After that, I think Gallium Arsenide is the next most accessible material which should have room for a few more nodes.
By then hopefully those more exotic technologies like carbon nanotube transistors will be a bit more viable.
I'm not much of an expert though, just recalling what was covered in one of my lectures.
I am not so keen on photon transistors, at least not in the short term. I have seen them proposed for decades but progress has been slow at best. Maybe some folks will figure it out and we will make that leap but current progress is not looking great. As for Carbon nano tubes, I'm not too familiar with the applications of this for transistors but it does sound like something that can get us a little further down the road.
All this arm chair speculation. All the people that work on these things would know far better than me. And I mean ALL the people! ;)
I agree regarding photonic transistors, I have a hard time seeing how those would scale down to even current transistor sizes/densities. Especially given that they'd probably have to rely on much shorter wavelength light, which comes with its own issues.
Carbon Nanotube transistors are fairly promising though; they're structured similarly to regular transistors but can have some neat properties like ballistic transport regimes (meaning carriers cross the channel essentially without scattering), better heat dissipation, and the ability to handle much higher currents. IIRC they're also very tunable, with the bandgap controlled by the shape of the tube.
Main issue with CNT transistors is just that - as with most things involving CNTs - we need production methods that are significantly more consistent with the shape and quality of tube they produce. We'd need to be able to print billions of tubes per chip with almost the exact same shape to compete with traditional semiconductors.
Do they make any 5nm chips other than apple silicon ones? What about the 3nm ones, will they all be apple? I would like to put one of those new chips in a normal pc.
Yes, they make a lot of 5nm and 4nm parts for Qualcomm and others. For example, the Snapdragon 8+ Gen 1 is based on TSMC's 4nm (the regular 8 Gen 1 is based on Samsung's 4nm). AMD's Ryzen 7000 series (Zen 4) uses TSMC's 5nm.
Apple tends to have the best margins and is willing to pay upfront for TSMC's capacity so they generally get it ahead of other customers. However, there are reports that Intel will be using TSMC's 3nm: https://www.windowscentral.com/intel-apple-tsmc-3nm-report. Intel has confirmed that it will be using TSMC's 3nm tech in its 2023 products, but hasn't said which products. Maybe we'll see Intel laptop chips at 3nm, maybe it'll be high-margin datacenter chips, maybe it'll be something for enthusiasts on the desktop, maybe it'll just be the GPU for chips made with Intel 4 (5nm): https://www.extremetech.com/computing/338679-intel-may-drop-....
AMD has said that 3nm will come with Zen 5c. Zen 5 is set to launch in 2024, with Zen 5c happening after that, so most likely in 2025.
Zen 4 just launched at 5nm after Apple had been using 5nm for 2 years. It looks like Apple will be the only 3nm customer in 2023. Apple is not only selling a lot of phones, but they also need a follow-up to the M1/M2 processors (and the M2 is a very marginal improvement). I don't want to sound like I'm knocking the M1 because it's truly amazing to have an M1 laptop, but Apple is shipping a 2-year-old processor in all its laptops. While the Mac might not use as many chips as their phone biz, the chips are larger which means using up more of TSMC's capacity.
Again, it took AMD 2 years to get to 5nm after Apple launched their 5nm chips. AMD might get to 3nm faster than they got to 5nm, but I wouldn't count on it. Qualcomm is sticking with 4nm for their 2023 Snapdragons. Samsung's transistor density for their "3nm" 2GAE is basically equivalent to TSMC's 4nm (while TSMC's 3nm is nearly 50% higher).
If the question is "should I wait for something that's just around the corner," I'd have to say no. AMD does have a Zen 4c on its roadmap at 4nm, but 4nm can be a bit misleading since we're talking about TSMC improving density by around 6% (and Samsung's 4nm trailing TSMC's 5nm). So I wouldn't wait for "4nm" either.
There's always uncertainty around when new stuff will happen, but it'll likely be more than 2 years before AMD moves to 3nm. Remember: we aren't even at the point where anyone has launched a 3nm part that you can buy. Let's say that Apple introduces something in April for their Macs and then launches new iPhones in the fall at 3nm. That's going to be using up a lot of TSMC's capacity through 2023 and into 2024. Qualcomm will want their 2024 chips to be 3nm in Spring 2024. So maybe late 2024 or 2025 some 3nm capacity starts being available for AMD and they can launch 3nm Zen 5c then.
Officially, this is what we have from AMD:
Zen 4 - 5nm (2022) [editorial note: this ended up being September 27, 2022]
Zen 4 V-Cache 5nm (2023)
Zen 4C - 4nm (2023)
Zen 5 - 4nm (2024)
Zen 5 V-Cache - 4nm (2024+)
Zen 5C - 3nm - (2024+)
Could we see Zen 5c in 2024? Maybe, but it seems like a stretch. TSMC's 3nm seems a tiny bit behind schedule and given that Zen 5c is targeting "2024+", it just doesn't seem likely.
So, there won't be an update to 3nm for Zen 4 and even Zen 5 won't launch at 3nm.
Using the node too soon can be a bad idea too, because it's going to be overpriced and that will be passed on to the end user. I think AMD's pace of adopting the node is reasonable. It's Apple who is trying to jump the gun and use it sooner than normal, making it cost way more than it otherwise could.
TSMC charges way more for the newest node and then prices go down over time.
Oh, for sure. Apple pays a huge premium, but Apple also sells premium devices.
Most of the time, if you're buying the most high-end stuff, you're overpaying. Even within a processor family, when you buy the high-end, you're often paying a big premium for marginally more performance.
I'm not saying that AMD isn't being smart. AMD's main competition is Intel and Intel has been behind enough that it probably doesn't make sense to be paying the huge costs of a new node. However, that might change if Intel's fabs get back on track. Intel 4 is supposed to be launching in 2023 with Intel 3 following in 2024. If Intel 4 launches in 2023 and comes in at the estimated 202 MTr/mm2, that'll be 38% greater than TSMC's 4nm and 47% greater than Samsung's 4nm.
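For context, back-solving the densities those percentages imply (just arithmetic on the numbers above; only the 202 MTr/mm² estimate is taken as given):

```python
intel_4 = 202.0               # MTr/mm^2, estimated
tsmc_4nm = intel_4 / 1.38     # "38% greater than TSMC's 4nm"
samsung_4nm = intel_4 / 1.47  # "47% greater than Samsung's 4nm"

print(f"Implied TSMC 4nm:    ~{tsmc_4nm:.0f} MTr/mm^2")    # ~146
print(f"Implied Samsung 4nm: ~{samsung_4nm:.0f} MTr/mm^2") # ~137
```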
Apple is in the mobile space where power and space are at a premium and they're selling premium devices. While power does matter in laptops, desktops, and servers, it isn't quite as acute.
However, it's possible that AMD will need to become more aggressive if Intel is able to hit its roadmap. If Intel 4 chips hit the market in late 2023 with a decent advantage, that might not bode well for AMD who has been riding TSMC's fab advantage to an extent. Again, I don't want to sound like I'm knocking AMD, but when Intel was shipping 14nm and they were shipping 7nm, they had a 2.5x density advantage. It's looking like the days where AMD has such a large advantage in process will be gone. That doesn't mean they'll fall behind, but I think it does mean that they might need to adopt new nodes faster than they have.
I do agree that AMD's pace has been reasonable given previous and current market conditions. I just think that the market conditions seem like they're going to shift over the next two years. It's one thing not to adopt TSMC's newest stuff when Intel is years behind TSMC's stuff from a year or two ago. It's another thing when a 2023 Intel might be sitting between TSMC's 5nm and 3nm and you're on TSMC's 5nm and when a 2024 Intel might be matching TSMC's 3nm and you're looking to launch 3nm in 2025. But we'll have to see if Intel can pull it off.
I just hope it won't cause price creep. CPUs and GPUs were already gradually getting more expensive at the same tiers in recent generations, and this rush to the newest node would only make it worse, unless increased scale of production helps lower prices.
The new process necessitates a completely new design. TSMC has to carefully work for a long time with AMD to design new chips that will work on this new "3nm" process, starting from basically the ground up.
So no, no refresh. But it could be used in the next or more likely next-next generation of their GPUs.
Yeah the distinction is leading vs trailing nodes. Automotive uses older nodes for cost and nobody is building any more of that. TSMC is trying to get them to move to 22nm nodes but so far that’s not what’s used, it’s 40nm and 65nm and 90nm and such.
Paradoxically there is actually some capacity available at the newer nodes now, because of big cancellations from major customers. But using them involves major design/validation costs and higher wafer costs etc, and the things that automotive and RF and power chips do doesn’t really scale.
> The term "3 nanometer" has no relation to any actual physical feature (such as gate length, metal pitch or gate pitch) of the transistors. ... a 3 nm node is expected to have a contacted gate pitch of 48 nanometers and a tightest metal pitch of 24 nanometers... "3 nm" is used primarily as a marketing term by individual microchip manufacturers ... there is no industry-wide agreement among different manufacturers about what numbers would define a 3 nm node
So can someone who is more familiar with it help me understand what progress we're making in terms of the physical aspect of the chips? Is transistor density still increasing? Are we using different materials? At a physical level what's different about this generation of chips?