My Little LLVM: Undefined Behavior is Magic!

General Technology, 2016-04-01

[Image: new LLVM logo]

There’s been lots of discussion online (and then quite some more) about compilers abusing undefined behavior. In response, the LLVM compiler infrastructure is rebranding and adopting a motto to make undefined behavior friendlier and less prone to corruption.

The rebranding puts to rest a long-standing issue with LLVM’s “dragon” logo actually being a wyvern with an upside-down head, a special form of undefined behavior in its own right. The logo is now clearly a pegasus pony.

Another great side-effect of this rebranding is increased security by auto-magically closing all vulnerabilities used by the hacker who goes by the pseudonym “Pinkie Pie”.

These new features are enabled with the -rainbow clang option, in honor of Rainbow Dash’s unary name.

A Few Examples

C++’s memory model specifies that data races are undefined behavior. It is well established that no sane compiler would optimize atomics, so LLVM will supplement the Standard’s happens-before relationship with an LLVM-specific happens-to-work relationship. On most architectures this will be implemented with micro-pause primitives such as x86’s rep rep rep nop instruction.

Shifts by bit-width or larger will now return a normally-distributed random number. This also obsoletes rand() and std::random_shuffle.
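Until -rainbow reaches your toolchain, a shift by the bit-width or more stays undefined rather than random. As a minimal sketch (assuming C++17, with checked_shl and shift_examples being hypothetical names that are not part of LLVM or the Standard), the undefined case can be refused instead of performed:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper, not an LLVM or Standard API: shifts left only when the
// amount is smaller than the bit width, and reports the undefined case as
// std::nullopt instead of performing it.
std::optional<std::uint32_t> checked_shl(std::uint32_t value, unsigned amount) {
    constexpr unsigned kBits = std::numeric_limits<std::uint32_t>::digits;  // 32
    if (amount >= kBits) {
        return std::nullopt;  // value << amount would be undefined behavior
    }
    return static_cast<std::uint32_t>(value << amount);
}

// Usage sketch: in-range shifts succeed, out-of-range shifts are refused.
void shift_examples() {
    assert(checked_shl(1u, 4).value() == 16u);
    assert(!checked_shl(1u, 32).has_value());
}
```

Returning std::optional pushes the decision about the out-of-range case onto the caller, which is less fun than a random number but easier to test.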

bool now obeys the rules of truthiness to avoid that annoying “but what if it’s not zero or one?” interview question. Further, incrementing a bool with ++ now does the right thing.

Atomic integer arithmetic is already specified to be two’s complement. Regular arithmetic will therefore now also be atomic. Except when volatile, but not when volatile atomic.
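The first sentence above is not a joke: signed std::atomic arithmetic wraps with no undefined results, while plain signed overflow remains undefined. A rough sketch of the contrast, assuming C++17 (checked_add, atomic_add and arithmetic_examples are illustrative names, not library functions):

```cpp
#include <atomic>
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical helper: adds two ints, returning std::nullopt where a plain
// a + b would overflow (undefined behavior for regular signed arithmetic).
std::optional<int> checked_add(int a, int b) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b) return std::nullopt;
    if (b < 0 && a < std::numeric_limits<int>::min() - b) return std::nullopt;
    return a + b;
}

// Atomic signed arithmetic is specified to wrap: fetch_add has no undefined
// results, so no overflow check is needed for well-defined behavior.
int atomic_add(int start, int delta) {
    std::atomic<int> value{start};
    value.fetch_add(delta);
    return value.load();
}

// Usage sketch.
void arithmetic_examples() {
    constexpr int kMax = std::numeric_limits<int>::max();
    constexpr int kMin = std::numeric_limits<int>::min();
    assert(checked_add(2, 3).value() == 5);
    assert(!checked_add(kMax, 1).has_value());
    assert(atomic_add(kMax, 1) == kMin);  // wraps around
}
```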

NaNs will now compare equal, subnormals are free to self-classify as normal / zero / other, negative zero simply won’t be a thing, IEEE-754 has been upgraded to PONY-754, floats will still round with style, and generating a signaling NaN is now guaranteed to not be quiet by being equivalent to putchar('a'). While we’re at it none of math.h will set errno anymore. This has nothing to do with undefined behavior but seriously, errno?
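For reference, here is what pre-PONY floats do today. This is a small sketch that assumes an IEEE-754 double and a build without fast-math style flags; equals_itself and float_examples are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true for every double except NaN under IEEE-754 rules.
bool equals_itself(double x) {
    return x == x;
}

// Usage sketch of the behaviors the paragraph above pokes fun at.
void float_examples() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(!equals_itself(nan));   // NaN never compares equal, even to itself
    assert(equals_itself(1.0));
    assert(0.0 == -0.0);           // negative zero compares equal to zero...
    assert(std::signbit(-0.0));    // ...but it still carries its sign bit
    // The smallest positive double is subnormal and classifies as such.
    assert(std::fpclassify(std::numeric_limits<double>::denorm_min()) == FP_SUBNORMAL);
}
```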

Type-punning isn’t a thing anymore. We’re renaming it to type-pony-ing, but it doesn’t do anything surprising besides throw parties. AND WHO DOESN’T LIKE PARTIES‽ EVEN SECURITY PEOPLE DO! 🎉
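Before the party starts, the well-defined way to reinterpret an object's bytes in C++ is to copy them with std::memcpy rather than cast pointers. A minimal sketch, assuming a 32-bit IEEE-754 float (float_bits, bits_to_float and punning_examples are hypothetical helper names):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: returns the object representation of a float by
// copying its bytes, which avoids strict-aliasing violations.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Hypothetical helper: the inverse copy, from bits back to a float.
float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Usage sketch, assuming IEEE-754 binary32 encoding.
void punning_examples() {
    assert(float_bits(1.0f) == 0x3F800000u);
    assert(bits_to_float(0x3F800000u) == 1.0f);
}
```

Compilers typically lower these fixed-size copies to a single register move, so the defined version usually costs nothing extra.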

A Word From Our Sponsors

The sanitizers (especially undefined behavior sanitizer, address sanitizer and thread sanitizer) are great tools when dealing with undefined behavior. Use them on your tests, combine them with fuzzers, try them as cupcake topping! Be warned: their runtimes aren’t designed to be secure and you shouldn’t ship them in production code!

Cutie Marks

To address the horse in the room: we’ve left the new LLVM logo’s cutie mark as implementation-defined. Different instances of the logo can use their own cutie mark to illustrate their proclivities, but must clearly document them.

Posted by JF Bastien and Michael Spencer.

LLVM Project Blog
