Worth noting that Intel has dropped their "old" compiler and the newer "Intel" compilers are LLVM-based. IMHO they will likely be pulling similar anti-AMD tricks with it, and they are keeping their paid version closed source - which is allowed by LLVM's license.
RMS was right that compilers should be GPL licensed to prevent exactly this kind of thing (and worse things which haven't happened yet).
On another compiler-related note, I find it insane that GCC had not turned on vectorization at -O2 for x86-64 targets. The baseline for that arch has SSE2, so vectorization has always made sense there. The upcoming GCC 12 will have it enabled at -O2. I'd bet the Intel compiler always did vectorization at -O2 for their 64-bit builds.
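For the curious, a minimal sketch (my own example, not from the thread) of the kind of loop this affects:

    /* saxpy.c -- a loop the vectorizer can handle with nothing beyond SSE2 */
    void saxpy(float *restrict y, const float *restrict x, float a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

With GCC 11, plain -O2 leaves this as scalar code; with GCC 12, -O2 vectorizes it because -ftree-vectorize (with the very-cheap cost model) is enabled by default at that level. Compiling with `-O2 -fopt-info-vec` shows the difference.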
> RMS was right that compilers should be GPL licensed to prevent exactly this kind of thing (and worse things which haven't happened yet).
The problem with this is that it wouldn't solve the problem in question: Intel would just have stuck with their old compiler backend instead of LLVM.
Besides, LLVM wouldn't have gotten investment to begin with if it were GPL licensed, since the entire reason for Apple's investment in LLVM is that it wasn't GPL. Ultimately, LLVM itself is a counterexample to RMS's theory that keeping compilers GPL can force organizations to do things: given deep enough pockets, a company can overcome that by developing non-GPL competitors.
> Besides, LLVM wouldn't have gotten investment to begin with if it were GPL licensed. The entire reason for Apple's investment in LLVM in the first place is that it wasn't GPL.
> The patch I'm working on is GPL licensed and copyright will be assigned to the FSF under the standard Apple copyright assignment. Initially, I intend to link the LLVM libraries in from the existing LLVM distribution, mainly to simplify my work. This code is licensed under a BSD-like license [8], and LLVM itself will not initially be assigned to the FSF. If people are seriously in favor of LLVM being a long-term part of GCC, I personally believe that the LLVM community would agree to assign the copyright of LLVM itself to the FSF and we can work through these details.
The reason people worked on LLVM/clang is that GCC was (and to some degree, is) not very good in various areas, and had a difficult community, which made fixing those issues hard. There's a reason a lot of these newer languages like Swift, Rust, and Zig are based on LLVM and not GCC. See e.g. https://undeadly.org/cgi?action=article&sid=20070915195203#p... for a run-down (from 2007; I'm not sure how many of these issues persist today; gcc has not stood still either of course - error messages are much better than they were in 2007, for example).
GPL3 changed things a bit; I'm not sure Lattner would have made the same offer with GPL3 around, but that was from 2005 when GPL3 didn't exist yet. But the idea that LLVM was primarily motivated by license issues doesn't seem to be the case, although it was probably seen as an additional benefit.
While clang is getting this support because it's not GPL, it's also providing well-deserved competition for GCC, and clang's presence woke the GCC devs up and pushed them to build a better compiler.
All in all I avoid non-GPL compilers for my code, but I'm happy that clang acted as a big (hard) foam cluebat for GCC.
In my opinion, we need a well polished GNU/GPL toolchain both to show it's possible, and provide a good benchmark to compete with. This competition is what drives us forward.
From what I saw of him, he never said that it is impossible to build non-GPL compilers, he said that the work free software developers do should not "help" proprietary software.
So yes, he basically said that if you want to develop a proprietary compiler, it should cost you, and not take GCC as a base to freeload. Intel basing their new compilers on LLVM clearly saved them effort.
The original AMD64 extension and associated ABI included SSE2, so vectorization was available on 64-bit x86 systems from day one. I think that's at least 17 years if not 20. GCC will use it by default at -O2 starting with their next release in about a month. Intel has contributed for a long time, but I don't know that this stupidity could be pinned on them. It wouldn't surprise me though.
If Intel had shipped a library/compiler that did just use feature flags and didn't check the CPU vendor, and the resulting code used features that on AMD ran much more slowly than the equivalent unoptimized code, would people blame AMD for the slow instructions, or blame Intel for releasing a library/compiler that they didn't optimize for their competitor's processor?
> AMD processors before Zen 3[11] that implement PDEP and PEXT do so in microcode, with a latency of 18 cycles rather than a single cycle. As a result it is often faster to use other instructions on these processors.
There's no feature flag for "technically supported, but slow, don't use it"; you have to check the CPU model for that.
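To make that concrete, here's a rough sketch (mine, not from the thread; it assumes the GCC/Clang <cpuid.h> helpers, and the family numbers are from memory, so verify against vendor docs) of how a library ends up combining the CPUID feature bit with a vendor/family check for the PDEP/PEXT case:

    #include <cpuid.h>
    #include <stdbool.h>
    #include <string.h>

    /* Use PEXT/PDEP only when the CPUID bit is set AND we're not on an AMD
       family 17h part (Zen 1/2), where they're microcoded and slow. */
    static bool pext_is_fast(void)
    {
        unsigned a, b, c, d;
        if (!__get_cpuid_count(7, 0, &a, &b, &c, &d) || !(b & bit_BMI2))
            return false;                      /* no BMI2 at all */

        char vendor[13] = {0};
        __get_cpuid(0, &a, &b, &c, &d);
        memcpy(vendor + 0, &b, 4);             /* vendor string order is EBX, EDX, ECX */
        memcpy(vendor + 4, &d, 4);
        memcpy(vendor + 8, &c, 4);
        if (strcmp(vendor, "AuthenticAMD") != 0)
            return true;                       /* otherwise, trust the feature bit */

        __get_cpuid(1, &a, &b, &c, &d);
        unsigned family = ((a >> 8) & 0xf) + ((a >> 20) & 0xff);
        return family >= 0x19;                 /* Zen 3 and later */
    }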
All that said, the right fix here would have been to release this as Open Source, and then people could contribute optimizations for many different processors. But that would have required a decision to rely on winning in hardware quality, rather than sometimes squeezing out a "win" via software even in generations where the hardware quality isn't as good as the competition.
When it comes to Intel, I am, and have been, so disgusted that they held back computing by about 6-10 years by consistently shipping overpriced, barely improved-upon quad-core processors that I:
1. Put nothing shitty past them.
2. Will never ever purchase their products again.
The real problem is the endless pursuit of profit though, instead of the pursuit of ever-advancing, ever-improving technological superiority, and sadly AMD isn't any better in this area I've come to see. The moment they conclusively, provably became better than Intel, they jacked up their price, even though their processors were using the same 7nm process that, at that point, was extremely reliable and had a 93% usable chip ratio.
So it turns out as soon as one company gains superiority they immediately become shitbags focused on money instead of focused on the advancement of technology and mankind. It puts anyone with a moralistic stance on what technology should be and how it should be implemented and distributed into a real pickle.
I was hoping that AMD would be the better company here, especially given they nearly died, but turns out they also are ready and willing to squander the goodwill of those of us who bought their chips not just when they were on the last legs, but also during their recovery period.
"So it turns out as soon as one company gains superiority they immediately become shitbags focused on money instead of focused on the advancement of technology and mankind"
That's exactly why our economy is stagnating - over the past 30 years many major hard industries have become uncompetitive oligopolies or cartels, and we have people defending this state of affairs.
Some digital industries are even outright monopolies.
> The moment they conclusively, provably became better than Intel, they jacked up their price, even though their processors were using the same 7nm process that, at that point, was extremely reliable and had a 93% usable chip ratio.
Why wouldn't they have jacked up their price, if provably better?
It's not like this happened on its own, without more effort from AMD (or less from Intel)?
AMD didn't have the technical prowess to do it, that's why. And even if they had, the Wintel duopoly was so entrenched that even if their offerings were "nearly as good as" Intel's, they still wouldn't have been able to make headway because Intel was threatening OEMs like HP, Dell, etc.
If Jim Keller hadn't gone back to help AMD, and if Dr. Lisa Su hadn't decided to take on that challenge, we'd likely be stuck in an era of processor Dark Ages, OR, Apple and their Apple Silicon line of processors would be even more attractive than they already are.
Go back and benchmark your old Intel CPUs with security mitigations enabled. You'll lose 30-60% performance in syscall heavy and other common workloads while Bulldozers barely change.
Intel gained an unfair advantage and built their reputation by taking shortcuts with security. The FX series weren't marvels of design engineering, but they weren't nearly as behind the performance curve as customers were deceived into thinking.
This probably has a lot to do with the anti-consumer segmentation and processor locking that Intel implemented at the dawn of the decade - arbitrary socket changes every 2 years so you couldn't upgrade without buying a new motherboard, and locking overclocking behind a paywall being the two most egregious.
AMD's processors, while not fast, did none of those things and were cheap. I guess that buys you a lot of good will when Intel's still charging 300 dollars for CPUs that wouldn't beat a 2500K at 4.6GHz until several years down the line.
Maybe, but the people that drive the hate for Intel online all fall into the former category.
And "have to replace the motherboard along with the CPU" is the exact thing we're talking about here: there was no technical reason for Intel to make the earlier boards incompatible, they did it just because they could. Not that there was ever really a reason to upgrade beyond "buy the cheapest K series, set multiplier to 46-48x, done", but even if you wanted to, you couldn't.
It was an anti-consumer practice and said consumers never forget it (not that most tech channels don't provide active reminders of it). And those people are who everyone else asks when "I'm getting a new computer", they say "buy the competitor's product", and the rest is history.
That feature flag is merely a macro in the Linux kernel source code, and doesn't appear to be exposed to userspace. It is entirely different from the kind of flags under discussion, which are in the return values of the CPUID instruction and available for any program to query.
The enhancement applies to string lengths between 1 and 128 bytes long. Support for fast-short REP MOVSB is enumerated by the CPUID feature flag: CPUID.(EAX=07H, ECX=0H):EDX.FAST_SHORT_REP_MOVSB[bit 4] = 1
So there is indeed a CPUID feature flag for fast rep movsb.
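For reference, a minimal sketch of querying that bit (leaf 7, subleaf 0, EDX bit 4) with the GCC/Clang cpuid.h helpers:

    #include <cpuid.h>

    /* Returns non-zero if the FSRM bit described above is set. */
    static int has_fast_short_rep_movsb(void)
    {
        unsigned a, b, c, d;
        return __get_cpuid_count(7, 0, &a, &b, &c, &d) && (d & (1u << 4));
    }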
So... ignore the CPUID bit indicating the instruction is present and look at the manufacturer string, which is exactly what Intel is being lambasted for here.
No, you don't ignore the CPUID feature bit. You use the CPUID to enable the feature by default, but override that default by blacklisting specific models that you know would have undesirable implementations.
Last time this came up on Hacker News I discovered SolidWorks 2021 was using an older MKL library that supports the MKL_DEBUG_CPU_TYPE=5 environment variable. I'm on an AMD CPU and measured a small SolidWorks FPS and rebuild-time improvement with the flag enabled.
Multiple versions of MKL DLLs exist in the install directory of SolidWorks 2021. Indeed, the DLLs supporting FloXpress and simulation seem to be the updated MKL version that no longer supports the flag. However, the main executable only seems to call sldmkl_parts.dll. It appears to be MKL version 2018.1.156, which does support the flag.
It would depend on the version of MKL. If Solidworks has (just for example) statically linked to or bundled in an old version of MKL, then it should work there, still.
The philosophy behind MKL is that each CPU vendor provides an MKL for their CPU. If you expect to mix and match MKLs and CPUs, you don’t understand the goals of MKL.
The expectation in the HPC community is that an interested vendor will provide their own BLAS/LAPACK implementation (MKL is a BLAS/LAPACK implementation, along with a bunch of other stuff), which is well-tuned for their hardware. These sort of libraries aren't just tuned for an architecture, they might be tuned for a given generation or even particular SKUs.
I learned about this recently when trying to optimize ML test architecture running on Azure. It turns out having access to Ice Lake chips would allow optimizations that should decrease compute time and therefore cost by 20-30%.
Each vendor. Intel BLAS (MKL) has Intel-specific optimizations and AMD BLAS has AMD-specific optimizations.
Intel is still acting in bad faith by allowing MKL to run in crippled mode on AMD. They should either let it use all available instructions or make it refuse to run.
The latest oneMKL versions have sgemm/dgemm kernels for Zen CPUs that are almost as fast as the AVX2 kernels (that require disabling Intel CPU detection on Zen).
Accelerate and MKL have some overlap (notably BLAS, LAPACK, signal processing libraries and basic vectorized math operations), but each also contains a whole bunch of API that the other lacks. Neither is a subset of the other.
They both contain a sparse matrix library, but exactly what operations are offered is somewhat different between the two.
They both have image processing operations, but fairly different ones. Accelerate has BNNS, MKL has its own set of deep learning interfaces...
Replying to [dead] sibling post from kxyvr: yes, Accelerate provides a Q-less sparse QR on Apple platforms (https://developer.apple.com/documentation/accelerate/sparse_..., in particular SparseFactorizationCholeskyAtA). I believe that MA49 from HSL does it as well, and may have more acceptable licensing than SuiteSparse depending on your situation.
For anyone shipping binaries to customers, using the Intel compiler could well be considered negligent. Intel have made it clear they will secretly sabotage /your/ customers if you use their tools to make your product, and in fact they have done so. They will secretly sabotage you if you aren't a "pure Intel" shop. Those actions were and remain completely hostile.
In the light of "Reflections on Trusting Trust" [1]
"Intel cannot be trusted to supply your compiler at any price." That's a point of view that is a lot more than just "a reasonable one to hold." The reflection on Intel and their lack of reckoning having been caught out sabotagingyourcustomers is something any customer of Intel needs to consider - included in that assessment must be the expected value of the $$$ loss of purchasing from Intel. It's really not something anyone can responsibly ignore and fail to assess. Then go ahead and making your responsible and informed engineering and business trade off.
edit: The point being we all have a bar of "well they wouldn't actually do that" in a purchasing decision. That bar for Intel is dramatically lower as a result of this incident and failure to properly address it in full with a mea culpa and consequences rather than the ongoing minimum action required by the courts and damage limitation we've seen. It is very hard to see how the probability of them secretly sabotaging your goals could have gone down here.
That is what Intel think of their reputation and what they think they can get away with in their response to you.
Has anyone tried a recent version of MKL on AMD? I assume they were shunting AMD off into an AVX codepath because pre-Zen AMD lacked AVX2 (well, Excavator had it, I guess...).
If they are sending Zen down the generic AVX2 codepaths by default and those are competitive with, say, OpenBLAS, that seems reasonable, right?
Hopefully BLIS will save us all from this kind of confusion eventually.
What should they have done instead? Built a compiler with a "cripple Intel" function? So people would have to download the executable that's fastest on their CPU, even though they use the same instruction set?
The issue here is that they used a slower code path even on CPUs that could run the faster one, just because they were made by a competitor.
You say "AMD should have made their own compiler", but why? What else should they have made? An OS? An office suite? Why?
AMD should concentrate on making LLVM and GCC work great on AMD processors, by contributing the needed code. They are already making some contributions but could be doing more, and they could be funding experts to work on that and giving those experts the information they need.
It says they might as well stop dividing their effort and focus on the upstream LLVM (or alternatively treat their own version as just a development branch that they can push upstream from). While they have expertise on the details of their processor they may benefit more by cooperating with all the compiler experts outside their company.
To fix this problem AMD would have to work on making LLVM and GCC work great on Intel processors. That would be the only way to make people not use the Intel compiler for extra performance and ending up with binaries that are crippled for AMD. Clearly that's not a solution for this problem.
AMD's software offerings (e.g. look at uProf vs vTune) are functional at best. Intel's are much easier to use, have a lot more documentation, and actually make your life easier versus having basically just a firehose of data.
I think we can simply imagine a common scenario: some employee working for Company X, developing a compiler suite, and adding necessary optimizations for Company X's processors. Meanwhile, Company Y's processors don't get as much focus (perhaps due to the employee not knowing about Company Y's CPUIDs, supported optimizations for different models, etc.). Thus, Company Y's processors don't run as quickly with this particular library.
Why does this have to be malicious intent? Surely it's not surprising to you that Company X's software executes quicker on Company X's processors: I should hope that it does! The same would hold true if Company Y were to develop a compiler; unique features of their processors (and perhaps not Company X's) should be used to their fullest extent.
No, this was definitely intentional. Intel is doing extra work to gate features on the manufacturer ID when there are feature bits which exist specifically to signal support for those features (and these bits were defined by Intel themselves!).
If they had fixed the issue shortly after it was publicly disclosed it might have been unintentional, but this issue has been notorious for over a decade and they still refuse to remove the unnecessary checks. They know what they're doing.
The thing is: the bits to check for SSE, SSE2, ..., AVX, AVX2, AVX-512? They're in the same spot on Intel and AMD CPUs. So you don't need to switch based on manufacturer. The fact that they force a `GenuineIntel` check makes it seem malicious to many.
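A sketch of what vendor-neutral dispatch looks like with the GCC/Clang builtins (my own illustration, assuming those builtins are available) - the whole point being that the same check means the same thing on every vendor's chip:

    #include <stddef.h>

    __attribute__((target("avx2")))
    static void add_avx2(float *dst, const float *src, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] += src[i];                  /* compiler may emit AVX2 here */
    }

    static void add_baseline(float *dst, const float *src, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] += src[i];                  /* SSE2 -- the x86-64 baseline */
    }

    void add(float *dst, const float *src, size_t n)
    {
        if (__builtin_cpu_supports("avx2"))    /* feature bit, not vendor string */
            add_avx2(dst, src, n);
        else
            add_baseline(dst, src, n);
    }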
All browsers pretend to be MSIE (and all compilers pretend to be GCC). You'd think AMD would make it trivial to change the vendor ID string to GenuineIntel for "compatibility".
The CPUID instruction allows software to query the CPU about whether an instruction set is supported. Code emitted by Intel's compiler would only check whether the instruction set exists if the CPU is from Intel, instead of just always detecting it.
AMD can choose to implement (or not) any instruction set that Intel specifies, and Intel can choose to implement (or not) any instruction set AMD specifies; however, it would in 100% of cases be wrong to check who made the CPU instead of checking the implemented instruction set. AMD implements MMX, SSE1-4, AVX1 and 2. Any software compatible with these must work on AMD CPUs that also implement these instructions.
If AMD ever chooses to sue Intel over this (likely as a Sherman Act violation, same as the 2005 case), a court would likely side with AMD due to the aforementioned previous case: Intel has an established history of violating the law to further its own business interests.
I’m with you generally, but having written some code targeting these instructions from a disinterested third-party perspective, there are big enough differences in some instructions in performance or even behavior that can sincerely drive you to inspect the particular CPU model and not just the cpuid bits offered.
Off the top of my head, SSSE3 has a very flexible instruction to permute the 16 bytes of one xmm register at byte granularity using each byte of another xmm register to control the permutation. On many chips this is extremely cheap (eg 1 cycle) and its flexibility suggests certain algorithms that completely tank performance on other machines, eg old mobile x86 chips where it runs in microcode and takes dozens or maybe even hundreds of cycles to retire. There the best solution is to use a sequence of instructions instead of that single permute instruction, often only two or three depending on what you’re up to. And you could certainly just use that replacement sequence everywhere, but if you want the best performance _everywhere_, you need to not only look for that SSSE3 bit but also somehow decide if that permute is fast so you can use it when it is.
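To illustrate with an invented example: reversing the 16 bytes of a register is a single pshufb where that instruction is fast, but the same result can be had from plain SSE2 shuffles where it isn't (a sketch, not anything the parent comment prescribes):

    #include <immintrin.h>

    /* One pshufb (SSSE3) where it's cheap... */
    __attribute__((target("ssse3")))
    static __m128i reverse_bytes_pshufb(__m128i v)
    {
        const __m128i idx = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                         8, 9, 10, 11, 12, 13, 14, 15);
        return _mm_shuffle_epi8(v, idx);
    }

    /* ...or a short SSE2-only sequence where pshufb is microcoded and slow. */
    static __m128i reverse_bytes_sse2(__m128i v)
    {
        v = _mm_shufflelo_epi16(v, _MM_SHUFFLE(0, 1, 2, 3));  /* reverse low 4 words  */
        v = _mm_shufflehi_epi16(v, _MM_SHUFFLE(0, 1, 2, 3));  /* reverse high 4 words */
        v = _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2));    /* swap 64-bit halves   */
        return _mm_or_si128(_mm_srli_epi16(v, 8),             /* swap bytes per word  */
                            _mm_slli_epi16(v, 8));
    }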
Much more seriously, Intel and AMD’s instructions sometimes behave differently, within specification. The approximate reciprocal and reciprocal square root instructions are specified loosely enough that they can deliver significantly different results, to the point where an algorithm tuned on Intel to function perfectly might have some intermediate value from one of these approximate instructions end up with a slightly different value on AMD, and before you know it you end up with a number slightly less than zero where you expect zero, a NaN, square root of a negative number, etc. And this sort of slight variation can easily lead to a user-visible bug, a crash, or even an exploitable bug, like a buffer under/overflow. Even exhaustively tested code can fail if it runs on a chip that’s not what you exhaustively tested on. Again, you might just decide to not use these loosely-specified instructions (which I entirely support) but if you’re shooting for the absolute maximum performance, you’ll find yourself tuning the constants of your algorithms up or down a few ulps depending on the particular CPU manufacturer or model.
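A common mitigation for the rcpps case (a sketch of one approach, not a claim about what any particular vendor library does) is to follow the estimate with one Newton-Raphson step, which both improves precision and makes the result far less sensitive to which hardware approximation you started from:

    #include <immintrin.h>

    /* rcpps only guarantees ~12 bits of relative precision, and different CPUs
       may return different low-order bits.  One Newton-Raphson refinement,
       x1 = x0 * (2 - d * x0), brings the result close to full single precision. */
    static __m128 recip_refined(__m128 d)
    {
        __m128 x0  = _mm_rcp_ps(d);
        __m128 two = _mm_set1_ps(2.0f);
        return _mm_mul_ps(x0, _mm_sub_ps(two, _mm_mul_ps(d, x0)));
    }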
I’ve even discovered problems when using the high-level C intrinsics that correspond to these instructions across CPUs from the same manufacturer (Intel). AVX512 provided new versions of these approximations with increased precision, the instruction variants with a “14” in their mnemonic. If using intrinsics, instruction selection is up to your compiler, and you might find compiling a piece of code targeting AVX2 picks the old low precision version, while the compiler helpfully picks the new increased-precision instructions when targeting AVX-512. This leads to the same sorts of problems described in the previous paragraph.
I really wish you could just read cpuid, and for the most part you’re right that it’s the best practice, but for absolutely maximum performance from this sort of code, sometimes you need more information, both for speed and safety. I know this was long-winded, and again, I entirely understand your argument and almost totally agree, but it’s not 100%, more like 100-epsilon%, where that epsilon itself is sadly manufacturer-dependent.
(I have never worked for Intel or AMD. I have been both delighted and disappointed by chips from both of them.)
I don't think you read the article. Go read it first before you make your hypothesis. If it was as easy to fix as using an environment variable (which no longer works), then it was done intentionally.
I don't think the fact that it can be enabled/disabled by environment variable indicates malicious intent. It could be as simple as Intel not caring to test their compiler optimizations on competitors' CPUs. If I had to distribute two types of binaries (one optimized but possibly broken, vs. un-optimized and unlikely to break), I would default to distributing the un-optimized version. Slow is better than broken.
I understand some end users may not be able to re-compile the application for their machines, but I wouldn't say it's Intel's fault, rather that of the distributors of that particular application. For example, if AMD users want Solidworks to run faster on their system, they should ask Dassault Systemes for AMD-optimized binaries, not the upstream compiler developers!
Anyways, for those compiling their own code, why would anyone expect an Intel compiler to produce equally optimized code for an AMD cpu? Just use gcc/clang or whatever AMD recommends.
The thing that gets me about Intel's culture, as someone who worked there, was that Intel as an organisation was completely unable to actually accept they'd done anything wrong. Ever.
There are lots of cases where Intel has either screwed up or done things that were unarguably anti-competitive. It happens at every company; I don't like Uber, but I'm not going to blame Uber today for the fuckery that Kalanick got up to.
In each case you could ask Intel HR, or Intel senior management, what they thought about it, and it was never Intel's fault. The answers to any questions about this sort of stuff would be full of pettifogging, passive voice, and legalese. The result was that the internal culture was an extremely low-trust environment, since you knew people were willing to be transparently intellectually dishonest to further their careers. I haven't been there since Gelsinger arrived, but I hope that changes; I wonder how much it can change in the legal environment we're in.
I don't think this is dishonesty - it's auteur mentality. In Intel's view, AMD was a second-source vendor that went rogue, and gets to free-ride on their patents because Intel couldn't be arsed to extend x86 to 64-bit. If they had their way, they'd own the x86 ISA interface and all their competition would be incompatible architectures that you have to recompile for. Crippling AMD processors with their C compiler wasn't dishonest, it was DRM to protect their """intellectual property"""[0].
Gelsinger was the head designer on the 486, so he was around during the time when Intel was obsessed with keeping competition out of their ISA and probably has a case of auteur mentality, too.
[0] In case you couldn't tell, I really hate this word. The underlying concepts are, at best, necessary evils.
At the firmware / driver level, fully open specifications for high performance hardware is an impossible dream.
At best, detailed documentation is a lower priority item below "make it work" and "increase performance".
At worst, it requires exposing trade secrets.
Edit: It'd probably be more productive for everyone if we set incentives and work such that the goal we want (compilers that produce code that runs optimally on Intel, AMD, and other architectures) isn't contingent on Intel writing them for non-Intel architectures. (Said somewhat curmudgeonly, because everyone complains about things like this, but also doesn't really appreciate how insanely hard and frustratingly edge-case-ridden compiler work is.)
On November 12, 2009 AMD and Intel Corporation announced a comprehensive settlement agreement to end all outstanding legal disputes between the companies, including antitrust and patent cross license disputes. In addition to a payment of $1.25B that Intel made to AMD, Intel agreed to abide by an important set of ground rules that continue in effect until November 11, 2019.
Customers and Partners
With respect to customers and partners, Intel must not:*
[...]
Intentionally include design/engineering elements in its products that artificially impair the performance of any AMD microprocessor.
AMD[1] and NVidia[2] do "make" their own compilers. AMD is notorious for a "build it and they will come" mentality, despite the fact that this hasn't worked. AMD needs to make it easy to adopt their hardware, and the way this is done is with software.
When they finally get to the point that their driver/libs are as easy to install as Nvidia's, it might be too late. I've argued this with AMD folks before.
The barriers to adoption need to be low. Friction needs to be low. They need to target ubiquity[3].
I was getting better performance out of the NVidia HPC SDK compilers, but then again, the old PGI compilers it is based upon (with an LLVM backend now) have always been my go-to for higher-performance code.
I've got some Epycs and Zen2s at home here, and I have both compilers. Haven't done testing in recent months, but they've been updating them, so maybe I should look into that again. Thanks for the nudge!
Actually, Nvidia bought the Portland Group (PGI) compilers.
And Intel's Fortran compiler is (or has been - now its backend is LLVM) MS's compiler via DEC, Compaq, and HP: MS Visual Fortran 4 -> DEC Visual Fortran 5 -> Compaq Visual Fortran 6 -> Intel Visual Fortran ;).
I'm not sure of what issue you have with my statement. For me, it is a painless download + sh NVIDIA-....run. I have mostly newer GPUs, though the 3 systems (1 laptop and 2 desktops) with older GTX 750ti and GT 560m run the nouveau driver (as Nvidia dropped support for those).
It's a 13-year-old laptop, and still running strong (Linux though). Desktops are Sandy Bridge based. The RTX 2060 and RTX 3060 are doing fine with the current drivers. I usually only update when CUDA changes.
But yeah, it's pretty simple. I can't speak to non-Linux OSes generally, though my experiences with Windows driver updates have always been fraught with danger.
My Zen 2 laptop has a built-in Renoir iGPU, and I use it alongside the NVidia dGPU also built in (GTX 1660 Ti). I leverage the Linux Mint OS's packaging system there for the GPU switcher. I run the AMD on the laptop panel and the NVidia on the external display. Outside of weirdness with kernel 5.13, I've not had any problems with this setup.
My point is, that "single download" or apt command of yours is a royal pain in the ass to maintain, and makes things like kernel hacking a royal nightmare, all so Nvidia can play stupid out-of-tree games with the Linux kernel maintainers. Easy "for a subset of users" does not excuse going out of the way to create more friction where none need exist.
But I'm glad your preferred workloads are unaffected. That counts for something I guess.
In the past, AMD just straight up had horrible software.
More recently, AMD have been investing more in open software, probably with the goal that indeed, a community form and they get "leverage" / ROI for their investment.
On the flip side, Intel invest heavily in high-quality but jealously guarded and closed source software.
With this nuance, I'm not so sure it's clear cut which one is "acceptable," and it's an interesting ethical question about Open Source and open-ness in general.
AMD still has horrible software; compare CUDA to whatever crap AMD thinks you should use. Truth is, it's even hard to say what their alternative is, not to mention how horribly poorly they support what is, or at least should be, their second most important target, if not their most important/lucrative one.
And Intel has sandbagged us with 4 cpu cores for ages, leading to software that isn't being optimized for more cores. Suddenly AMD starts pushing many cores with high single core performance and Intel magically turns hyperthreading on for lower tier cpus and starts putting out way more cores.
There's a variety of options that are available here, and I don't buy the argument that AMD's behavior is automatically unethical.
A. Company makes and sells hardware, and offers no software.
B. Company makes and sells uniquely featured hardware, and offers software that uses those unique features.
C. Company makes and sells hardware that adheres to an industry standard, and offers software that targets hardware adhering to that standard.
D. Company makes and sells hardware that adheres to an industry standard, then uses their position in related markets to give themselves an unfair advantage in the hardware market.
Of these, options A, B, and C are all acceptable options. AMD has traditionally chosen option A, which is a perfectly reasonable option. There's no reason that a company is obligated to participate in a complementary market. Option D is the only clearly unethical option.
Intel's legitimate course is to make their CPUs run actually faster than the competition, instead of tricking people into running slower code on the competition.
So it appears not only is this posting from 2019, but the most recent information they reference is 2010. This seems to be no longer relevant? I’d love it if submissions on HN had a small blurb from the author explaining why their submission is interesting/relevant.
There are 2020 updates around MKL (but you may be correct that that content is about 2019 MKL optimizations).
At any rate though, based on Intel's track record I think this content is still relevant and of value to engineers who don't have domain knowledge in compilers or work downstream.
I would love to see a 2022 follow-up from Agner Fog on this. He has published work on C++ compilers as recently as 2021, so I'm sure he has recent real-world info on the topic.
> After Intel had flatly denied to change their CPU dispatcher, I decided that the most efficient way to make them change their minds was to create publicity about the problem. I contacted several IT magazines, but nobody wanted to write about it. Sad, but not very surprising, considering that they all depend on advertising money from Intel.
Sorry to go on this tangent: but is capitalism so rotten that everything eventually corrupts? Here even outlets for discussion on topics of science and technology self-censor to maximize profit. So much for freedom of speech.
Strictly speaking, it's not about "capitalism", but about journalists trying to get funding from anywhere other than directly from their readership. (There used to be law proposals forbidding this; I'm not sure if advertising was already on their radar back in the 1940s.)
Of course there would still remain the issue of self-censoring to avoid annoying your readership, not sure how you can deal with that...