
Why does LLVM generate "ud2" instructions? WTF?



Because it can. This is undefined behavior we are talking about.


Sorry, I don't get it. The Intel manual says this about UD2:

"Raises an invalid opcode exception in all operating modes."

What is undefined here? LLVM must not generate such instructions unless it really wants such an exception. (Like Linux's panic() does on x86)


The original piece of C code has undefined behaviour, meaning LLVM can generate anything it wants. It happens to generate ud2 instructions (because it's better to crash hard and fast) but it could just as well print "puppies puppies puppies" a million times.
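To make the "anything it wants" point concrete, here is a minimal C sketch. The function names are made up for illustration and are not from the code under discussion. Dereferencing a null pointer is undefined, so the compiler may assume it never happens; on a path where it can prove the pointer is null, Clang may emit a trap such as ud2 instead of a load. The checked variant has defined behaviour for every input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: undefined behaviour if p is NULL.
   The compiler is allowed to assume p is never NULL here. */
int load_unchecked(const int *p)
{
    return *p;
}

/* Hypothetical helper: defined for every input, including NULL. */
int load_checked(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}

/* Usage: only the checked variant may be called with NULL. */
void demo_load(void)
{
    int x = 7;
    assert(load_unchecked(&x) == 7);
    assert(load_checked(&x, -1) == 7);
    assert(load_checked(NULL, -1) == -1);
}
```

Calling load_unchecked(NULL) is the case where the generated code is unconstrained: a ud2, a segfault, or something stranger are all permitted outcomes.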


> it's better to crash hard and fast

I'm not so sure about that. If you need to squeeze out a little more performance, code that is "technically undefined" can be more portable than dropping to ASM.

I think LLVM should emit a warning on code with undefined semantics and generate DWIM instructions instead of UD2s.


It really depends on your motives. If you will need to port to new platforms in the future, it's better to have a hard-and-fast crash now, so you can find and remove the undefined behavior now, rather than face a large bug backlog years down the line.

But you're correct that sometimes it can be expedient to exploit such technically undefined behavior. (I've committed this sin myself, most commonly in serializers/deserializers)
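For the serializer/deserializer case, a rough sketch of what the shortcut and its defined alternatives might look like. The function names are hypothetical, and the cast in the comment is only a guess at the kind of trick meant here. Reading a 32-bit value through a cast pointer can be misaligned and can violate the aliasing rules; memcpy or explicit shifts express the same read with defined behaviour, and optimizing compilers typically turn the memcpy into a single load anyway.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tempting shortcut, shown only as a comment because it is undefined
   when buf is misaligned or is really a char array:
       uint32_t v = *(const uint32_t *)buf;                          */

/* Defined alternative: copy the bytes into a properly typed object.
   The result is in host byte order. */
uint32_t read_u32_host(const unsigned char *buf)
{
    uint32_t v;
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Defined and byte-order independent: assemble a little-endian value. */
uint32_t read_u32_le(const unsigned char *buf)
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Usage: decode a little-endian wire value and round-trip a host value. */
void demo_read(void)
{
    const unsigned char wire[4] = { 0x78, 0x56, 0x34, 0x12 };
    uint32_t host = 0xCAFEF00Du;
    unsigned char raw[sizeof host];

    memcpy(raw, &host, sizeof host);
    assert(read_u32_le(wire) == 0x12345678u);
    assert(read_u32_host(raw) == host);
}
```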


Thanks.


To make him or her aware of the issue? I'd rather have my program crash at a well identifiable point in the execution flow than start acting rogue for no obvious reason (in both debug and production environments).
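You can get the same "crash at a well identifiable point" behaviour on purpose, without relying on what the compiler does with undefined code. A minimal sketch, assuming a made-up accessor name: check the precondition and call abort() from the standard library. GCC and Clang also offer __builtin_trap(), which on x86 is typically lowered to ud2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical accessor: stop immediately on a bad index instead of
   reading out of bounds, which would be undefined behaviour. */
int checked_get(const int *arr, size_t len, size_t i)
{
    if (arr == NULL || i >= len) {
        abort(); /* defined, easy-to-locate crash */
    }
    return arr[i];
}

/* Usage: in-range reads behave like plain indexing. */
void demo_checked_get(void)
{
    const int a[3] = { 10, 20, 30 };
    assert(checked_get(a, 3, 0) == 10);
    assert(checked_get(a, 3, 2) == 30);
}
```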



