Intel Initiates EOL for the VCA2: Three Xeons on a PCIe Card (anandtech.com)
32 points by rbanffy on May 12, 2020 | 35 comments



Full-blown Xeons on a PCI-e card?

Lately, I've been wondering why we don't see computers physically built more like old workstations, with processors on cards plugging into I/O backplanes. The modern incarnation might be CPU+RAM on a PCI-e x32 card, and a simple "motherboard" with power and PCI-e. (Does 32-lane PCI-e even exist? I've never even seen photos.)


Look up the Intel NUC 9 Extreme (aka Ghost Canyon); it's a very small form factor gaming PC in the arrangement you're describing. The motherboard is one PCIe card, the graphics card is another, with a dumb I/O plate connecting the two.

As a concept it's interesting. However, the 238 x 216 x 96 mm size is just too big when the Velkase Velka 3 is 227 x 187 x 97 mm and uses an off-the-shelf ITX motherboard, the same PSU, and the same graphics card to achieve the same result for much less overall.

A very powerful and quiet Flex ATX PSU can be had at https://www.sfftec.com/product-page/enhance-enp-7660b-pro so that's not a problem either. The same unit is sold as taobao item 533119895425 (use a proxy like Superbuy). Note that modular PSUs with the modular plugs sticking out are longer than the Flex ATX standard allows, so they aren't compatible with the Velka 3, but there are plenty of variants of this item that are.


You see this all the time in industrial settings, based on https://en.wikipedia.org/wiki/CompactPCI which looks like this https://www.kontron.com/products/boards-and-standard-form-fa...

Some discrete GPUs in laptops/notebooks, and also in rackable systems, use a similar arrangement when the GPU is pluggable/"built to order" on a common board rather than hard-soldered on.


The Intel NUC Compute Element[1] is somewhat like that. I have no reason, however, to think this will last beyond this generation of the NUC.

1: https://www.intel.com/content/www/us/en/products/boards-kits...


See the standards proposals for various next-gen links.

https://www.openfabrics.org/images/eventpresos/2017presentat...


More like 3 iGPUs with some extra baggage. The CPUs on those aren't directly accessible in any way; they're only used for whatever needs to be offloaded from the fixed-pipeline encoders/decoders, and even that might end up offloaded to the main CPU.


Not directly accessible? There are Ethernet ports on the card, and the article mentions you can SSH into them.


The SSH is there to manage the images running on the VCAs; you aren't adding CPUs to the host.


They aren't beefy Xeons; they look more like desktop-class E3s with some HBM. Still nice, though.

I never saw one of these, but they may present themselves as one or more computers the same way a Xeon Phi coprocessor did.


They aren't using HBM; they're using DDR4 SODIMMs.

They don't present themselves as CPUs to the host. They run a self-contained Linux image, and the only thing accessible to the host is the streaming interface.


I think OP was referring to the on-package eDRAM. It helps quite a bit with bandwidth and latency.


> they run a self contained Linux image the only thing that is accessible to the host is the streaming interface.

That's more or less how the Phi coprocessors presented themselves, with the exception that you could ssh into the coprocessor and install software on it.


Because, outside of a few niche markets, many people are now using workstation-level laptops, and stuff like M2M or PCMCIA never got much adoption.


"AVC transcoding at 30 FPS" sounds a lot less useful than what you can get from an NVIDIA consumer or Quadro card via NVENC. Very weird, and multiple Xeons doesn't sound cheap.

Am I misunderstanding the product?

I know Intel is coming out with a discrete GPU, which will probably have plenty of video encode hardware to compete with NVIDIA (esp since game streaming at 4K is quite popular).


It can do 40+ 1080p streams in real time per “card” and Intel provides the software.


Each card has 3 E3-1585L CPUs, each with 4 cores (8 threads).

Intel's white paper (https://www.intel.com/content/dam/www/public/us/en/documents...) says up to 12 streams per CPU.

The card uses 235W.

A consumer security DVR can record 16 channels for $100.

I found a TI chip from 2011 that could encode 6 streams at 1080p30, about 10W for $100.

The biggest advantage I see, as you pointed out, is that Intel did all the software work and it's practically drop-in.
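Putting the thread's numbers side by side, a rough back-of-the-envelope (no card price is quoted anywhere here, so this only compares power efficiency; I'm using the 44-streams-per-card figure cited further down, though Intel's own numbers vary between that and 12 per CPU, and as a reply below notes the TI part likely only encodes while the VCA2 transcodes):

    # Rough streams-per-watt comparison using the figures quoted in this thread.
    vca2_streams, vca2_watts = 44, 235   # 1080p30 transcodes per card, card power
    ti_streams, ti_watts = 6, 10         # 2011-era TI chip, encode only

    print(f"VCA2: {vca2_streams / vca2_watts:.2f} streams/W")  # ~0.19
    print(f"TI:   {ti_streams / ti_watts:.2f} streams/W")      # ~0.60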


Quality is a big factor, especially at broadcast bitrates. QuickSync has superb quality even at low bitrates; it's better than even Turing NVENC, which is top notch and much, much better than Pascal's.

I doubt the Texas Instruments chip you found can do transcoding; it can probably only encode or decode.

44 streams of 1080p30 for H.264-to-H.264; of course you can select any resolution and frame rate you like.

https://www.intel.com/content/dam/www/public/us/en/documents...


Gamers everywhere wish you wouldn't consider 30FPS "real time" :)


This wasn’t for gaming


Regardless, calling 30 FPS real-time is still pretty misleading.


It's not. It can transcode up to 44 streams of 1080p@30fps in real time, as in there is effectively no delay, which makes it suitable for broadcasting live video.

Real time means that there is no delay between frames going in and going out.
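For a concrete sense of the budget implied by the 44 x 1080p30 figure (a quick sketch; the card's three iGPUs work in parallel, so the real constraint is keeping each stream's latency under one frame interval while sustaining the aggregate rate):

    # Real-time budget for 44 simultaneous 1080p30 transcodes.
    streams, fps = 44, 30
    frame_interval_ms = 1000 / fps            # 33.3 ms: latency budget per stream
    aggregate_fps = streams * fps             # 1320 frames/s the card must sustain
    avg_ms_per_frame = 1000 / aggregate_fps   # ~0.76 ms average per frame, card-wide

    print(f"{frame_interval_ms:.1f} ms budget per stream")
    print(f"{aggregate_fps} frames/s sustained across the card")
    print(f"{avg_ms_per_frame:.2f} ms average per transcoded frame")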


> Real time means that there is no delay between frames going in and going out.

As long as you don't put in a >30 FPS source. This mindset seems prevalent throughout the modern broadcast industry, as evidenced by most modern "web" releases of old TV shows being 30 FPS progressive, under the misunderstanding that it's somehow equivalent to 60i, despite throwing away half the motion information found in the interlaced fields. Take an old Simpsons (or whatever) DVD, run an episode through QTGMC to get 60 progressive frames per second, and compare it to the same episode on a streaming service. You'll see what I mean.
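For anyone who wants to try that comparison, a minimal VapourSynth sketch (assuming the ffms2 source plugin and the havsfunc script are installed; the file name is a placeholder, and TFF field order is typical for NTSC DVDs but worth verifying for your disc):

    import vapoursynth as vs
    import havsfunc as haf

    core = vs.core
    clip = core.ffms2.Source(source='episode.vob')  # 29.97i NTSC DVD rip

    # QTGMC turns each field into a full frame: 29.97i in, 59.94p out,
    # preserving the motion a naive 30p web master throws away.
    clip = haf.QTGMC(clip, Preset='Slower', TFF=True)
    clip.set_output()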


That has nothing to do with anything discussed here. There is no hard frame-rate limit; you can do 240fps if you want, you just cut your maximum number of streams.


“Real-time” has nothing to do with frame rate and everything to do with latency and simultaneous throughput.


This card is old (maybe Quick Sync was better than NVENC in those days), and Intel will always recommend their own chips over a competitor's.


This thing is sweet, and I wish the idea were still around, but it's obvious these specific cards are awful now and were just produced as a symptom of Intel fucking the market on core counts.

Tangentially, I've been asking for a modular Mac Pro for years, and as far as right ideas go, this card comes closer than the current Mac Pro.


How did these work? The article says you could ssh into them, so they all had their own RAM and ran their own individual OS? I don't think these CPUs were multi-socket capable.

How did they boot? Network? So the PCIe connection was essentially just for networking?


Google: PCIe Non-Transparent Bridge

They're independent systems with a special PCIe switch that looks like an end device to all participants and allows shuffling data around at PCIe speeds.

(the NTB function is integrated on some Intel processors AFAIK, but it may also be a separate chip)
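To make the "looks like an end device" part concrete: from the host's side, the NTB's memory window shows up as an ordinary PCI BAR. A rough Python sketch of the idea only (the device address and BAR number are placeholders, and real deployments go through the kernel's ntb/ntb_transport drivers rather than raw mmap):

    import mmap, os

    # Hypothetical BAR exposed by an NTB, as surfaced by Linux sysfs.
    BAR = "/sys/bus/pci/devices/0000:03:00.0/resource2"

    fd = os.open(BAR, os.O_RDWR | os.O_SYNC)
    win = mmap.mmap(fd, 4096)   # map the first page of the shared window
    win[0:4] = b"ping"          # writes land in the peer system's memory
    os.close(fd)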


This is what I'm curious about as well. My poorly educated guess is that maybe these are just meant to process specific tasks sent to them from the host machine over SSH.

But then why have connectivity through both Ethernet and PCI-E?


If one needed more CPUs, the advantage of this card over "just more servers" is, I'm guessing, the dramatically lower latency and higher bandwidth available via PCIe vs. some other interconnect?
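Roughly, yes. The theoretical link bandwidths, assuming the card sits in a PCIe 3.0 x16 slot (I haven't checked the actual slot width):

    # Theoretical link bandwidth: PCIe 3.0 vs common Ethernet speeds.
    pcie3_lane_gbps = 8 * 128 / 130                              # 8 GT/s, 128b/130b encoding
    print(f"PCIe 3.0 x16: {16 * pcie3_lane_gbps / 8:.1f} GB/s")  # ~15.8 GB/s
    print(f"10GbE:        {10 / 8:.2f} GB/s")                    # 1.25 GB/s

Latency favors PCIe by an even wider margin, since there's no network stack in the path.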


These are used for live video encoding and editing mostly for broadcast.

I don’t think they’ve been in wide use outside of the sports broadcast industry but I might be wrong.

The main component being used isn't the actual CPU but the iGPUs; Intel has one of the best video encoder cores out there right now.


It has a power input and Ethernet ports so one can ssh into it; does it need the host on the PCIe bus at all?

So, can I just plug power and ethernet into it and boot linux?


It may need some support from the host if it doesn't have any local persistent storage.


netboot + tmpfs root would be good enough for me


Judging by the prices Xeon Phi coprocessors fetch on eBay, I'm not really optimistic about finding these floating around.



