Programmable logic seems like a space that could be disrupted with a company that has fully open-source tooling around its hardware along with great documentation. FPGA tooling is in the dark ages compared to modern software development environments. If you found such a company I will be your customer and simp/shill/stan for life.
I agree on the 'user friendliness' of FPGA tooling, but SW engineers (myself included) often make the mistake of thinking an FPGA toolchain is like a software toolchain and therefore everything is equivalent. These are the two main reasons that FPGA toolchains aren't as open as compilers (I'm not saying they're good ones):
* Processors have a defined interface ("instruction set") that you can implement a compiler for and have it keep working. The interaction between the FPGA tooling and the FPGA targeted is much more complicated. It's a bit like graphics card drivers and the hardware, where the line between the two is blurry and you have to consider them both together as the graphics system. This means that the 'cleverness' of the design is much more exposed to the toolchain, and having that open source would give away a lot to competitors.
* Synthesising FPGAs is a LOT harder and more complicated than compiling code (and compiling code is pretty hard). There are many stages in the pipeline involving synthesising the logic and laying it out on the target for best performance. To give an example, we upgraded our toolchain for a Xilinx FPGA and power consumption improved by 20% because they'd changed the ways that they segment clock domains and gate off stuff not being used. Good performance in the tooling is a major differentiating factor that you might not want to give away.
That point about toolchain complexity just reminds me of where compilers were 20 years ago. You could get significant speed bumps because one vendor had an optimisation another didn't. That's probably still true for things like the Intel compiler, for all I know. I do remember, back when I was doing computer vision stuff before NVidia ate the world, that it wasn't uncommon to come across code that expected access to Intel-compiled maths libraries in order to run in real time.
The point is that the free toolchains got good enough that in general you don't need to care, and I think it's reasonable to expect the same dynamics to apply here.
The difference is that software compilation is a fairly local optimization process - allowing you to change/inline one bit of generated code without significantly affecting the performance of the rest of the generated code - address space is cheap. On the other hand, especially when the fpga approaches capacity, one change could cause a significant portion of the design to have to be rerouted as space is limited causing significant performance differences or not meeting the timing requirements anymore.
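A toy illustration of that last point, not a real placer: the sketch below models a single row of logic slots and inserts one new cell at a fixed position, pushing occupants along until a free slot absorbs the shift. The row size, the `insert_cell` helper and the shift-right rule are all assumptions invented for this example. Real place-and-route tools are far more sophisticated, but the trend is the same: the fuller the device, the more of the existing design a single change disturbs.

```python
from typing import List, Optional

Row = List[Optional[str]]


def insert_cell(row: Row, pos: int, name: str) -> int:
    """Insert `name` at slot `pos`, shifting occupants right until a free
    slot absorbs the shift. Returns how many existing cells had to move.

    Hypothetical toy model: one row, shift-right only.
    """
    free = pos
    while free < len(row) and row[free] is not None:
        free += 1
    if free == len(row):
        raise ValueError("no free slot to the right of the insertion point")
    # Slide the occupied run [pos, free) one slot to the right.
    row[pos + 1:free + 1] = row[pos:free]
    row[pos] = name
    return free - pos


# 16 slots at 50% utilisation: every other slot is free.
sparse: Row = [f"c{i}" if i % 2 == 0 else None for i in range(16)]
# 16 slots at roughly 94% utilisation: only the last slot is free.
dense: Row = [f"c{i}" for i in range(15)] + [None]

moved_sparse = insert_cell(sparse, 4, "new")
moved_dense = insert_cell(dense, 4, "new")
print(f"sparse row: {moved_sparse} cell(s) moved")
print(f"dense row:  {moved_dense} cell(s) moved")
```

In the sparse row only the immediate neighbour moves; in the nearly full row every cell between the insertion point and the last free slot is displaced, and each displaced cell would also need its routing and timing re-checked.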
> The difference is that software compilation is a fairly local optimization process - allowing you to change/inline one bit of generated code without significantly affecting the performance of the rest of the generated code
Not really.
The uop cache and L1 code cache of modern chips are rather small. You can often grossly increase performance locally by loop-unrolling, but if that causes the "hot path" to no longer fit in the uop cache (or L1 cache), then you've lost a chunk of global performance in exchange for a small local gain.
Global vs local optimization is just a tough subject in general, even on CPUs (which are probably easier than FPGAs).
OP wrote "a fairly local optimization". Compared to what a placement algorithm must do for an FPGA, your example is still exactly that, and one that's relatively easy to get right with a few heuristics.
Yep. But to me that only changes what the problem is that the toolchain solves. It doesn't affect the social and economic pressures on how the relevant toolchains evolve.
To a certain (admittedly limited) degree the "limited space" problem with FPGAs maps intriguingly to the "limited memory" conditions that early LISP and FORTH compilers spawned in.
More to the point, the market for FPGAs is fundamentally limited by some factors.
If you are making 10,000+ units the economics are overwhelmingly in favor of ASIC over FPGA.
(In particular, if you are the first mover that proves the market with an FPGA, you should have started an ASIC design in parallel, because the second and third movers will have an ASIC from day one and a cost structure 10x or 20x better than yours!)
This keeps FPGA a niche market because it serves niche markets.
> The point is that the free toolchains got good enough that in general you don't need to care, and I think it's reasonable to expect the same dynamics to apply here.
No, the question was: would the vendor want this?
I'm referring to this comment above:
> Programmable logic seems like a space that could be disrupted with a company that has fully open-source tooling around its hardware along with great documentation. (...)
Not company supported, though, just reversed and company-tolerated. The investment was made by brilliant people donating their personal time, not the company funding an initiative - big difference there.
I think a truly "open" documentation and compilers model (like what AMD have been up to with their graphics cards) would still be a game-changer in the industry.
I honestly don't think opening up documentation would change a lot.
The medium hard part of FPGA tooling is synthesis, and the truly hard part is high performance placement and routing with low run times.
The competitive benefits of being good at that are huge, because it can literally mean the difference between choosing a component of one vendor vs the one of another. As a user, the tooling and the underlying HW architecture can be mostly treated as an inseparable blob: it's not as if you're going to design specifically for a particular FPGA logic element feature, you rely entirely on what the backend tool will do with it.
If I were Intel or AMD, I'd be fine with opening up some documentation that helps users to a certain extent, but I'd never agree to an open backend model.
With respect to opening the backend: it's also unlikely that your competitor is going to use your PnR against you, between patent encumbrance and backend silicon differences, no? This is the exact same argument as in graphics where AMD have done fine open-sourcing their compiler backends, nobody is going to cut-rate clone their silicon and NVidia are staying far, far away from any of the methods+processes there if they are smart.
With that said, I don't understand the FPGA market well enough to know if open tooling would be an advantage competitively.
With respect to opening documentation:
I really do wonder. I don't know the market well enough. For a "we need an FPGA because we need XYZ to be fast and don't have the volume/$ to tape out," of course a good PnR and meeting timing is the #1 concern, and a savvy buyer would choose the best performing solution even if the software suite is brutal to use. But for "we need a PLD for some glue logic, it doesn't need to be 100% blazing fast, it needs to meet XYZ requirements," would a buyer choose a toolchain that worked better and let their engineers go to market faster, even if the PnR wasn't quite as efficient? In that case, opening up documentation and letting the open-source ecosystem build out support could be valuable.
What would be telling here would be whether Lattice have seen a measurable sales impact from the reverse engineered open toolchain appearing for their parts. I cynically think that perhaps the idea is futile in this industry, but that's also what everyone thought about graphics and AMD went and did it anyway...
I'm a bit skeptical on this one. Open tools may help, but for full vendor support you are going to have to use their tools.
Also I wouldn't say that their tools are in the dark ages- most of the problems with their tools stem from modern software tool design. So for example, I find Vivado / SDK to be complex yes, but also much more pleasant to use than tool chains for simple micro-controllers. An example is the total mess that is software libraries delivered via STM32CubeMX (no part migration, libraries depend on custom BSPs vs. auto-generated code, good luck preserving your fixes if their auto-generated code has bugs).
A reason that the Xilinx tools are big and complex is that they are following modern software tool practices (that I hate): it's a large Java-based tool that aims to hide the underlying relatively simple command line programs. Xilinx tools from the mid-90s are actually closer to what the open source FPGA tools look like today.
While AMD's software is historically trash, they do have a track record many years long now of publishing ISA / microcode documentation, which is still a step ahead of FPGA vendors (as far as I know, all known bitstream formats are reverse engineered, not first-party documented).
Meanwhile, support for monitoring their Zen CPUs was backed out of the kernel in December because the interface is undocumented and buggy, and AMD isn't helping.
AMD is a Big Company, could go either very right or very wrong.
Looking at their open-source Linux graphics stack, if they ever end up integrating FPGA fabric into their chips as a kind of dynamic accelerator (I remember reading about that being one possibility after the acquisition), I wouldn't be entirely surprised if they did the same for that. Still probably not a fully FOSS FPGA tool suite, but even drivers and such would be an improvement over what I've heard the current state of the industry is.
Looking at the pretty spectacular failure that was AMD "Fusion", I wouldn't hold my breath on successful FPGA integration.
Fusion was the name AMD gave pretty soon after ATi acquisition to their fancy HSA concept of accelerating compute with gpgpu, especially with the integrated gpus in their APUs. It was announced with great fanfare and pomp, but never really became reality. Today their gpgpu framework (rocm) is both unpopular and doesn't even support igpus in apus.
Almost certainly not; I'd be extremely surprised if AMD rocked the boat at all here. Xilinx is already the dominant player in the spaces they're batting for (datacenters are where all the money is and their major focus for several years now), they won't feel pressure to do such things.
Overall it seems like a move to strengthen their portfolio for integrated high-end solutions, not something they want to radically shake up. So I suspect it'll be business as usual at Xilinx after the acquisition, but maybe one day we'll actually get a mythical datacenter-class FPGA/CPU combo using Epyc, if it makes sense for their customers.
That was my hope as well, I still dream of Xilinx or Intel deciding to 'LLVM' their toolchain down to bitstream, I keep looking with a lot of interest at Lattice and the Yosys project as it is maturing.
These companies make money selling hardware, just open the tools and let people use what they want instead of forcing this 'visual programming' paradigm with half-baked IDEs on everyone.
My worry with AMD is what will happen to the SoC chips that have actual ARM cores in the fabric (like the Zynq family)?
Visual block diagram editors like the one in Vivado, Libero, etc are extremely useful and powerful in practice and an equivalent alternative for open source EDA design would be very welcome. And I'm a person who absolutely loathes Verilog, so it's not like I'm a purist or anything. If that's your representative example of your issues with proprietary FPGA software, I honestly don't think it's a very good one.
> 'visual programming' paradigm with half-baked IDEs
The paradigm is the least important, when it works, I don't know anyone in real world projects that didn't have to configure something manually through a tcl script because it was either not exposed in GUI or not working in GUI.
My issue is not with 'visual programming' my issue is with the 'half-baked' part of the tools.
There are too many multi-year-old bugs that go unacknowledged by Xilinx. The VHDL2008 standard came out more than a decade ago and from memory it was only in 2019 that they supported it as a valid simulation file in Vivado. There is a plethora of autogenerated/copied/cached files that add a ridiculous amount of friction to any attempt at sane version control. There is just too much complexity in what I assume is an attempt to hide the shoe-horned messiness of the tcl backend and 'prettify' the GUI frontend.
> The paradigm is the least important, when it works, I don't know anyone in real world projects that didn't have to configure something manually through a tcl script because it was either not exposed in GUI or not working in GUI.
Worse is the other way around, a GUI control not exposed through the TCL scripts, which makes it very difficult to maintain a TCL build script which can be version controlled (unlike the project files: this particular IDE likes to completely rewrite its project files in a way which is not at all amenable to diffing).
I somewhat agree, but I think the number 1 factor of low FPGA adoption is the absurd price. Put a good price tag like in GPUs and you have a new industry. Of course, there are issues with tooling too. Maybe there needs to be a simplified API, akin to OpenGL, that takes care of the boilerplate.
And the FPGAs need not even be as efficient in packing as Xilinx or Altera.
The thing that always gets me: you are already charging for HW - that's the only real business model. The CAD tools 1) suck, 2) aren't going to be a major money maker, and 3) desperately need UI design and programming talent well beyond what ever gets applied today.
> FPGA tooling is in the dark ages compared to modern software development environments.
Dear god is it ever. My fiancee, bless her, got me a Spartan 6 for Christmas one year. I never got around to using it because I couldn't find a download for the single, outdated version of (I think) Arduino Software it claimed it 100% required.
I voted "no" on this. I didn't want to see Xilinx spend years working on getting acquired instead of working on new chips (which happened to Altera), and I thought that the acquisition price was too low given the current premium that AMD stock carries.
> [...] spend years working on getting acquired instead of working on new chips (which happened to Altera) [...]
I'm interested in what you mean by this. I was working at Altera at the time the acquisition went through, on the software tools mind you, but I didn't notice a significant shift in strategy. Are you referring to before or after the acquisition was announced?
The move to the Intel foundry cost Altera a lot - I don't know if it was to position themselves to be acquired or if it was for some other strategy, but it was a very expensive decision. Stratix 10 on TSMC may have landed at the same time as Ultrascale+ from Xilinx.
Generally, during and after a merger, there is a lot of instability in a company as layoffs in comparable departments get figured out and company infrastructures get merged. I assume that on the software side, you were insulated from this instability since they were going to need to keep developing Quartus (or its replacement) either way and Intel had no equivalent.
I was an Altera customer before and during the merger (including after it was announced). It looked from the outside like sales teams and chip design teams had significant amounts of instability. I personally experienced months of delays getting Arria 10s (and associated FAE support) from our sales team immediately after the acquisition. Our new sales team from Intel (figured out 8 months later) was twice the size, had to push CPUs and Intel's software in addition to FPGAs, and was also dealing with the embarrassing situation around Stratix 10 delays.
I see. My understanding was that part of the purpose of the deal to switch to Intel foundry was to set up for acquisition, so I think you're probably right there, but I'm not sure what the other strategic reasons were, if any. That move was 'fait accompli' by the time I joined. I've since moved on, but it's interesting to hear an outside perspective.
The delays surrounding Stratix 10 were indeed embarrassing, but I'm not sure how much of that was due to the switch to the Intel foundry vs architectural features new to S10.
I can't speak at all for sales/FAE support, but I guess that was a target ripe for "synergy", or w/e they call it.
Most of your point was about problems with the Intel fab though. AMD/Xilinx doesn't have any of these problems, not to mention they share the resources and tool set on TSMC. Which is a net positive.
Basically if AMD doesn't do anything to Xilinx, leaving them to continue on their own, both Xilinx and AMD will benefit from design cost reduction. Which is important considering leading edge design cost is forever increasing.
>spend years working on getting acquired instead of working on new chips (which happened to Altera)
That's interesting, can you expand on this? I'm curious what could have impacted them that much (I'm a student about to finish my undergrad in comp eng)
I wonder if we'll ever see FPGAs in consumer CPUs, given AMD's expertise with chiplets and interconnects. Say, for programmable specialized instructions. The relatively recently published patent "Method and apparatus for efficient programmable instructions in computer systems" (https://www.freepatentsonline.com/y2020/0409707.html) seems to point that way.
This is an idea that's been around forever and from a hardware perspective it's super easy. The problem is finding applications for this. You need to find something that is so difficult that it requires at least dozens of instructions (because the FPGA isn't going to be running at the CPU clock speed so it needs to be a decent chunk of work to justify), is done often enough to dedicate silicon for (the FPGA can't be dark 99.999% of the time), but not often enough that you can justify a full custom implementation. Then you need to write a set of custom instructions that map to this logic, and build support for using these custom instructions into the compiler - accounting for the fact you don't just need to dispatch data to the FPGA, you need to program it each time you change instruction and that takes forever in CPU terms.
It's not impossible, it's just very difficult, impacts every single part of the stack, and is very difficult to justify.
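To put rough numbers on why the operation has to be "a decent chunk of work", here is a minimal sketch of the break-even arithmetic. Every figure in it (a 4 GHz CPU, a 400 MHz fabric, a 1 ms reconfiguration, a 2-cycle fabric implementation, roughly one CPU cycle per replaced instruction) is a made-up assumption for illustration, and the helper names are hypothetical.

```python
import math
from typing import Optional


def fabric_to_cpu_cycles(fabric_cycles: int, cpu_hz: int, fabric_hz: int) -> float:
    """Express a latency measured in fabric clock cycles in CPU clock cycles."""
    return fabric_cycles * cpu_hz / fabric_hz


def calls_to_break_even(sw_cycles: float, hw_cycles: float,
                        reconfig_cycles: float) -> Optional[int]:
    """Calls needed before the per-call saving repays one reconfiguration.

    Returns None when the fabric version is not faster per call at all.
    """
    saving = sw_cycles - hw_cycles
    if saving <= 0:
        return None
    return math.ceil(reconfig_cycles / saving)


# All numbers below are hypothetical.
CPU_HZ = 4_000_000_000
FABRIC_HZ = 400_000_000
RECONFIG_CYCLES = 4_000_000  # 1 ms of reprogramming, counted in 4 GHz CPU cycles

# A custom instruction that takes 2 fabric cycles, seen from the CPU side.
hw = fabric_to_cpu_cycles(2, CPU_HZ, FABRIC_HZ)

# Replacing a 15-instruction sequence vs a 60-instruction sequence,
# assuming about one CPU cycle per instruction.
small_op = calls_to_break_even(15, hw, RECONFIG_CYCLES)
large_op = calls_to_break_even(60, hw, RECONFIG_CYCLES)

print(f"fabric latency in CPU cycles: {hw}")
print(f"15-instruction op breaks even after: {small_op}")
print(f"60-instruction op breaks even after: {large_op} calls")
```

Under these assumptions the short sequence never pays off because the fabric is slower per call, and the longer one only pays off after a large number of calls between reconfigurations, which is the squeeze described above.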
Correct but at the same time that overstates things, because extremely often the choice between full custom ICs versus FPGA is dictated by the expected volume of product, not just functionality considerations.
It is typically simply cheaper to deploy FPGAs in released products when the volume is small, while it may be cheaper to use full custom when the volume is in the millions to hundreds of millions, in the cases where either solution is functionally workable.
That includes amortizing the non-recurring engineering costs over the total units, which is typically higher for full custom than FPGA -- although sometimes they are actually in the same ballpark.
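The amortization argument can be sketched as a simple break-even calculation. The dollar figures below are invented purely for illustration (they are not real quotes for any part or process), and `total_cost` and `break_even_volume` are hypothetical helper names.

```python
import math


def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Total program cost: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(fpga_nre: float, fpga_unit: float,
                      asic_nre: float, asic_unit: float) -> int:
    """Smallest volume at which the ASIC route costs no more than the FPGA route."""
    if asic_unit >= fpga_unit:
        raise ValueError("ASIC never catches up unless its unit cost is lower")
    return max(0, math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit)))


# Hypothetical figures, chosen only to make the arithmetic visible.
FPGA_NRE, FPGA_UNIT = 200_000.0, 100.0
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 10.0

volume = break_even_volume(FPGA_NRE, FPGA_UNIT, ASIC_NRE, ASIC_UNIT)
print(f"break-even volume: {volume} units")
print(f"FPGA total at break-even: {total_cost(FPGA_NRE, FPGA_UNIT, volume):,.0f}")
print(f"ASIC total at break-even: {total_cost(ASIC_NRE, ASIC_UNIT, volume):,.0f}")
```

Below the break-even volume the FPGA's lower NRE wins; above it the ASIC's lower unit cost dominates. If the two NRE figures are in the same ballpark, as mentioned above, the break-even volume drops toward zero.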
Aside from that you are correct; people sometimes imagine that most any application can be significantly accelerated with FPGAs, but even in the cases where fine-grained parallelism is present to be accelerated (well-known not to be the case for all application areas), the FPGA solution space is decreased by the solution space where full custom makes engineering and financial sense.
Maybe they're targeting a different arch layer? Is there any mileage in pushing that sort of tech into on-chip routing? As you get more and more cores, obviously interconnect area becomes more of a problem (and bus bandwidth more constrained). Is there much to be gained from a compiler being able to say "This next bit of code wants as much uncontended bandwidth as you can muster between 5 cores and L1"? That way, actually reconfiguring anything would be a bunch of microcode, rather than something the compiler took direct control over.
There's some (perhaps lots) of potential for this for in-memory analytic processing (e.g. in an analytics-focused DBMS).
Also, a specific potential use of FPGAs is for pattern matching on large amounts of text/data: If you do it at all, you're likely to do it often; and it can't be a custom implementation since the circuit depends on the specific pattern.
Intel at one point tried an FPGA-in-package with a server CPU. It turns out that you can't downclock/power gate parts of FPGA designs nearly as easily as you can CPU components, so the whole package had thermal regulation issues. You had to run the CPU at a very conservative frequency in order to let the FPGA use power as it needs to.
There are FPGA-SoCs out there that are mostly FPGA, and that seems like the way to go if you want to combine FPGAs and CPUs, otherwise they should probably be on separate power and heat budgets.
Wondering the same; from my 10,000 ft view a cpu-fpga marriage seems like the kind of thing that could transform how performant general purpose computing is done.
The same was said back when AMD started making APUs (CPUs with on-board GPUs) and I was similarly hopeful, but we saw how that went. Most attempts at heterogeneous computing have gone only as far as sticking two previously separate chips in one package - no significant integration or co-operation. Since FPGAs are even more different from CPUs than GPUs are, even more programming effort will be needed to take advantage of them, so unless AMD make some kind of revolutionary framework for programming them, I fear FPGA acceleration will get even lower use than GPUs do (excluding 3D work, of course).
APUs have come a long way since the initial ones. AMD's own are now actually decent at 1080p gaming. I built my wife a gaming PC and didn't buy a dedicated GPU, just an AMD APU, and it can play CS:GO maxed out at decent fps.
I think we can all agree that APUs have come a ways since first coming out. The poster you replied to was more lamenting how the inclusion of a GPU into my CPU didn't create something bigger than the sum of its parts.
What we have is still a CPU and a GPU. Separate. Discrete. The inclusion changed nothing about what kind of software we can write or problems we can solve. Back when APUs were still just a slide in a powerpoint deck, there was a lot of talk about how this could "change everything" because of the radically different type of compute you could do on a GPU.
The closest thing to this is the coin mining craze, but everyone is gravitating towards the fastest hardware for that. And that's not your APU.
Yeah, as the first sibling comment said, I was more talking about the new generation of computing APUs were supposed to bring us, instead we got GPUs that happen to be on the same chip as the CPU. AMD's APUs are definitely awesome, especially the Ryzen+Vega ones, and if they made a Ryzen 7+Vega 10 chip for desktop, I'd buy it immediately, but it's still just a GPU that does standard GPU things - the fact that it's integrated is little more than an implementation detail.
Seems doubtful. I thought that way too in the '90s but what happened was that anything that was a common enough use case got moved into dedicated hardware (GPU, cryptographic acceleration instructions in the CPU) and anything uncommon is too hard for the average developer to write the Verilog/VHDL to implement.
We could, but then we need software to take advantage of it, which means FPGAs are like GPUs in the early days, without DirectX or OpenGL drivers sitting on top. Someone (Microsoft) will have to create something that works across both Xilinx and Altera.
Although I don't see that happening soon, mostly because no one has the interest in doing so. What most likely will happen is that Apple are so far ahead in this game they started to take notice and work on something.
On servers it is very different though. Tiles and chiplets will finally mean FPGAs can have their own node and optimisation while sitting next to the CPU, without all the complexity of trying to get them into the same silicon. Although the cost of software optimisation for specific tasks will likely mean they only go to hyperscalers.
To add to the other responses, I believe the end goal of "programmable specialized instructions" is being successfully explored by options other than FPGAs, e.g. the Raspberry Pi Pico state machines. Then there are also the hybrid CPU-FPGA models (or FPGA SoCs) mentioned in the other replies, like the Zynq.
I just hope they won't drop any of their CPLD chips. It would be a shame if the only way to set up a simple logic circuit without having to use 74xxx parts was to use an FPGA.
CPLDs also have the advantage of being non volatile. That can make your circuit simpler as you just add a socket and use an external programmer. FPGAs require external storage and more.
Intel picked up Altera in 2015, and now AMD picks up Xilinx in 2021. I'm curious to see whether AMD plans to do anything significant with this acquisition, or is it just executives following a well-worn path?
My guess was the latter, given how high AMD's market cap is currently. However, Xilinx/AMD have some common technology requirements around 2.5d integration and 7 nm IP.
I was hoping for an FPGA-SoC with some Xeon cores when Intel bought Altera, and now I'm hoping for one with Epyc cores.
Could be interesting... I may be greatly misunderstanding something like Apple's M1, but I think some of its performance gains are due to offloading tasks from the CPU to dedicated ASICs. That's great when you know what those tasks are going to be, and there's no reason to abandon that for known tasks. But if AMD can put a certain amount of FPGA capacity on chips then it might gain the flexibility to dynamically increase performance by offloading from the CPU to purpose-configured FPGA units in the field, and gain performance that might be better than running it alongside everything else on the CPU, even if it's not quite at the ASIC level.
I fully recognize that I may be speculating out my *s here, and would welcome further constructive comment even if it's just to say "Um, yeah, that's not how it works".
The biggest problem with this concept is that while ASICs can be extremely efficient, FPGAs are much less efficient than the equivalent ASIC. The flexibility comes at a substantial power, chip area, and speed cost. So much so that e.g. raw number crunching is more efficiently done by CPUs or GPUs in almost all cases. You need a particularly quirky computation before an FPGA is a good accelerator. FPGAs are more naturally suited to applications where ultra-low (or ultra-predictable) latency, extremely high bandwidth I/O (with relatively little processing), or particularly specialised DSP is required. Most of these are best served with an FPGA with a CPU attached, as opposed to the other way around, and the FPGA is not likely to be worth its cost to the majority of users. For an example of a specialised use-case: ASIC designers use racks of them to simulate the digital logic in large-scale designs, which are otherwise far too difficult to simulate on a CPU because CPUs really struggle to simulate billions of separate logic elements individually, and latency is a real killer for parallel processing. Even so, they run much, much slower than realtime.
Probably worry about the future of access and direction for FPGAs. Most acquisitions of FPGA companies resulted in worse products afterwards. I'm probably suffering from confirmation bias here.
Is there any example of a FPGA provider delivering better products after acquisition?
EDIT: I'd wager that if the last administration hadn't blocked the sale of Lattice it would now be dead.
From my current perspective Actel was OK for its time. I worked with APA150 FPGAs in the past, and for that ancient time they were OK. I heard Microchip has a cool SoC with RISC-V nowadays. I just don’t have time to play with them after work.
Uh, at the high end there is Achronix, and then at the lower end you have Microchip (ex-Microsemi, ex-Actel), Lattice, Efinix, and there are a couple of Chinese startups.
I don't have stock in either company, but I can see why a Xilinx stockholder might object.
From the AMD side of things, it's just expanding their offerings, and maybe bringing in some know-how and technologies that could help boost/extend their existing products.
If you have Xilinx stock because you believe in what they're doing and predict a bright future and a possible dividend, while at the same time you don't see much of a future in the AMD64 platform, then why would you want AMD to acquire the company?
Most stockholders don't care about the companies they invest in any more. It doesn't matter if they make FPGAs, cars or toaster ovens. What matters is that you can make a profit buying a stock now and selling it at a higher price down the road. Those kinds of stockholders won't block an AMD takeover if the price is right.
> What matters is that you can make a profit buying a stock now and selling it at a higher price down the road.
Making a profit has been the point of buying and selling stocks literally since day one, when the Dutch East India Company issued publicly tradeable shares in 1602.
If you want to show your support for the vision of a company, buy their t-shirts. If you're buying their shares (typically not even from the company itself, but from another shareholder) for that reason, you're doing it wrong.
amd64 isn't going away any time soon. Intel is still the largest player here. If I were Intel right now, I'd be trying to come up with the next iteration of x86, probably 256 bit.
There don’t seem to be many advantages to words larger than 64 bits IMO (they do exist, but things like GMP exist for that). If the purpose would be parallel processing (a la SIMD), we already have “256 bit x86” with AVX2. IIRC, Intel’s x86 chips can process 256 bits of integer data at the same time. And if you want 512 bits, AVX-512 exists as well (but it’s implemented internally as two sequences of 256 bit data).
For the same reasons that the US approved mergers of European and sometimes Chinese companies. If they have fairly major operations in China then China can get involved.
I'm not sure if either AMD or Xilinx has sufficient operations in China to justify it, though.
"The companies had been waiting nearly two years for their deal to clear global regulatory hurdles. The takeover had been approved in eight other jurisdictions, including the European Union and South Korea, since it was announced in October 2016. China was the lone holdout. "