The secret life of NaN

scarmig · on April 5, 2018

Javascript doesn't have the same richness of NaN as other language specs, FWIW. [0]

> the 9007199254740990 (that is, 253-2) distinct “Not-a-Number” values of the IEEE Standard are represented in ECMAScript as a single special NaN value. (Note that the NaN value is produced by the program expression NaN.) In some implementations, external code might be able to detect a difference between various Not-a-Number values, but such behaviour is implementation-dependent; to ECMAScript code, all NaN values are indistinguishable from each other.

> The bit pattern that might be observed in an ArrayBuffer (see 24.1) or a SharedArrayBuffer (see 24.2) after a Number value has been stored into it is not necessarily the same as the internal representation of that Number value used by the ECMAScript implementation.

[0] https://www.ecma-international.org/ecma-262/8.0/index.html#s...

stcredzero · on April 5, 2018

Could this be used to hide data in Javascript programs?

mikekchar · on April 6, 2018

As per TFA, some implementations use NaN-boxing to represent other values (all values are encoded into NaN). In those implementations, you will have pretty strange behaviour. V8 doesn't use NaN-boxing so it will work there.

Of course NaN-boxing is the reason that the ECMA spec specifies that all NaNs are treated equally. If you are encoding all of your values in NaN, you still have to have one NaN value that is actually NaN. The spec doesn't specify which one that must be.

BTW, I highly recommend reading TFA and the links within, because it's really fascinating (the "tagged-pointers" used in V8 and other languages like Guile are also really interesting).

scarmig · on April 5, 2018

You'd be relying on undefined behavior by the ECMAScript spec, but sure. Whether it works is implementation dependent, and you'd have no grounds for complaint if it changed at any time.

Playing around with it in Chrome 65.0.*...

  const buffer = new ArrayBuffer(12);
  const uint8s = new Uint8Array(buffer);
  const floats = new Float32Array(buffer);

  floats[0] = NaN, floats[1] = NaN; //127.192.0.0, 127.192.0.0, 0.0.0.0
  uint8s[0] = 104, uint8s[1] = 105; //127.192.104.105, 127.192.0.0, 0.0.0.0
  floats[2] = floats[0] + floats[1]; //127.192.104.105, 127.192.0.0, 127.192.104.105

  console.log(String.fromCharCode(uint8s[8]) + String.fromCharCode(uint8s[9]));

IshKebab · on April 5, 2018

Unlikely because those bits are use for type tagging.

kodablah · on April 5, 2018

Careful, some engines deal w/ NaN bits differently. E.g. the JVM doesn't guarantee non-NaN bits won't move (I had trouble with this at one point [0]). For specs like WebAssembly (and their conformance tests), they expect the backends preserve the bits during some reinterpret ops[1] IIRC.

0 - https://stackoverflow.com/questions/43129365/javas-math-rint... 1 - http://webassembly.org/docs/rationale/#nan-bit-pattern-nonde...

rdtsc · on April 5, 2018

> the double-precision format has 52 bits for the mantissa, which means there are 51 bits available for the payload.

I have seen data passed through the NaN payload in C in a signal processing application. It was both vile and genius at the same time. Vile because of how hackish it was, genius because it avoided a larger redesign of the application.

hermitdev · on April 6, 2018

First thought I had when reading about this is that the unused bits would be perfect for potentially reporting a cause of the NaN. A bit hackish, sure, but potentially useful, especially in C where you don't have exceptions that can carry human readable explanations of what wrong to arrive at a NaN.

laythea · on April 5, 2018

Incidentally, I found a cool way of checking whether NaN occurred:

Just do:

void some_function(float number) { if(number != number) number was passed in as <NaN> }

Took me a little bit to get my head around it but if the number is not equal to the number, then a problem occurred. I used this to stop camera code (driven by floats) from crashing when receiving non-sense input coordinates. It still doesn't work properly, but it doesn't crash now either! (at least not for that :)

jcoffland · on April 5, 2018

You should use isnan() or isfinite(). It will make your code more readable.

laythea · on April 5, 2018

agreed but that's why we also have // comment. No need to get fancy :)

ska · on April 5, 2018

commenting is not a good replacement for cleaner code, because they can drift and lie.

crysin · on April 5, 2018

Whenever the whole comments argument comes up I like to bring up this article[0]. I find that it describes a good balance between when comments are useful and when they are not.

[0]: https://blog.codinghorror.com/code-tells-you-how-comments-te...

hermitdev · on April 6, 2018

Exactly.

/begin rant I'm working on a C++ code base developed by contractors that were lazy and doing things the expedient way instead of the correct way (thousands of circular dependencies between libs - and even apps - apps depending on source from other apps), multiple copies basic functionality with slight changes (largely bug fixed in one place, but they forgot the other X places it was copied).

I've gone so far as to write a "cleanup" script in python that runs a number of transformations on the source. Fixing things like inconsistent line-endings (via dos2unix), inconsistent formatting (via clang-formatting), limited conversions of certain Boost uses to std C++, a few pervasive spelling errors (contracts were eastern European, non-native English speakers). Another thing I'm considering is removing all comments. Lots of dead, commented out, code. Also, invalid UTF-8 characters in comments, from an ANSI code page I've not been able to identify (which of course creates problems with the python script treating the source as UTF-8).

The comments that are currently present that I've seen are: 1. stupid/moronic/redundant (like merely marking constructors as ctor and destructors as dctor with no more info - like I can't just figure that out by reading the source to begin with) 2. Just plain wrong. As-in, comment no longer matches the code. 3. In a foreign language (Ukrainian/Russian), in an unknown code-page, so also worthless at face value (yeah, I could lookup the code pages for Ukraine and Russia have give it a shot and run the result through a translator), but it's likely the result will also fall under #2. 4. Unnecessary demarcation between functions. e.g. "// -------" out to ~80 characters wide between functions with no other white-space. Want things to be clearer? Don't use K&R notation and but open & close braces on the same indentation level (e.g. don't put the opening brace at the end of your loop/function/if/else/other statement, but put them alone on the next line). 5. Dead code. Don't commit commented-out code. Delete it. That's what version control is for. Want to know what it used to do? Look at the history, not the chain of commented-out code. Commenting out code is fine for quick testing. But don't commit that. People that follow you will see that and wonder "why is this here? is this significant?" 6. Random BOM (byte-order-marks) for UTF-8 source that is actually entirely ASCII. A lot of linux tools don't expect or handle BOM in UTF-8 sources (I'm looking at you, psql).

In this particular project I'm working on at my employer, I don't think I've seen a single useful comment in the source. Sadly, the Jira tasks filed by the now-fired contractors are also nearly universally useless. All headline and mostly zero description on what the problem is, usually no steps to reproduce.

/rant

Sorry for the rant, when I started writing this, I did not intend it to be so. But, as I wrote, I started remembering more and more things that have been driving me nuts.

mikekchar · on April 6, 2018

In one project I inherited, some contractors had been maintaining the code for about a year. Every change they made they added a comment box around it like "Mike made this change on Dec 31". What's worse was they included a copyright indication (I tried to find out if anyone had gotten a copyright assignment from them and the legal department literally slammed the door in my face). Their changes were bizarre and included a lot of things you ranted about. I reverted the code to the state before they arrived and sent the result to QA. "Wow! How did you fix so many bugs so quickly?" was the response from QA.

We had an iron clad contract with that contracting firm, but after I started reverting every single subsequent change they made, they stopped sending changes ;-) Luckily my boss was a director and could pave the political fallout from my rather brash (but justified) actions.

adrianN · on April 6, 2018

You should check out clang-tidy.

laythea · on April 5, 2018

cleaner code is not a good replacement for cleaner comments either though.

ska · on April 5, 2018

When you need comments, they should be clear.

But when you can make the code clear enough to need no commenting, you should do that instead.

hermitdev · on April 6, 2018

Comments need to explain why they are doing something, not what the code is doing. I can tell what the code is doing by reading the source. Otherwise, I agree. The code should be clear enough that no comments are needed.

wglb · on April 5, 2018

In "Refactoring", Martin Fowler says that "comments are bugs".

anarazel · on April 6, 2018

The amount of damage Martin Fowler has done with his writing...

wglb · on April 6, 2018

What in particular has caused damage?

khedoros1 · on April 7, 2018

Well, good code obviates the need for most comments. A comment shouldn't cover up a bug of clarity in a piece of code, in almost any case imaginable (basically, if it isn't a Fast InvSqrt() level of utility, the dev ought to reconsider the "cleverness" of the code). And anyhow, the implementation details of the code are liable to change; the comment becomes outdated, and there's a good chance that it won't be updated.

What about a comment that notes what part of a spec some code implements (i.e. something outside the actual behavior of the code)? A comment that answers "why" can be helpful, and can sometimes be worth the high cost that an unchecked, unexecuted part of the program inherently carries.

If Fowler's claiming that all comments are bugs, I'd call that damaging.

dalore · on April 5, 2018

The code is the comment if it's written well.

ridiculous_fish · on April 6, 2018

Some clang code:

  // ISO/IEC TR 18037 S5.3 (amending C99 6.7.3): "A function type shall not be
  // qualified by an address-space qualifier."
  if (Type->isFunctionType()) {
    S.Diag(Attr.getLoc(), diag::err_attribute_address_function_type);
    Attr.setInvalid();
    return;
  }

The comment justifies the code in a way that the code itself never could.

mikekchar · on April 6, 2018

A unit test could document that equally well (although the pointer to the spec would still be useful). Whether or not it would fit your style of programming is another matter, of course.

VMG · on April 6, 2018

A unit test would almost certainly explain the behavior in a completely separate place. That's not documentation.

dalore · on April 6, 2018

  if (isTypeFunction) {
    showErrAttrDialog(Attr);
    setInvalid(Attr);
  }

khedoros1 · on April 7, 2018

Maybe I'm missing something. The code that you've written doesn't look any clearer to me than the code from the compiler itself, and without the comment, there's no "why" for the behavior.

Later, if a bug is filed saying that the compiler isn't compliant with the requirements of "ISO/IEC TR 18037 S5.3", how can you be sure to find the code where the behavior is implemented?

dalore · on April 8, 2018

I would put that comment in the function header comment, not in the code itself. That's more meta.

And of course the function in question only deals with "ISO/IEC TR 18037 S5.3" so it's easily tested.

If someone files a bug saying the compiler isn't compliant with the requirements of "ISO/IEC TR 18037 S5.3", then they would provide a test case showing not compatible. Add that case to the existing unit tests and you will see which function fails. No need for searching the code to see where the behaviour is implemented. With clean code it's obvious, the test will show this method to be at fault. Even without any comments.

Code in functions should also try and stay at the same level of abstraction, moving low level stuff to abstracted methods that describe the intention (And then the low level method does the how). That way your code reads like a story.

khedoros1 · on April 9, 2018

> No need for searching the code to see where the behaviour is implemented. With clean code it's obvious, the test will show this method to be at fault. Even without any comments.

So...it sounds like you'd have one broken implementation of the code, and one working implementation, that might just cancel out the effects of the broken one. You're assuming code that has been cleanly written for its whole history, or a lot of love poured into it to develop quality tests for each requirement. How commonly does that actually happen?

khedoros1 · on April 9, 2018

That wasn't the most even-keeled response, and it's past the edit window. What I mean to say is that I've never seen anything but a small codebase that couldn't use some in-code explication.

Clear code provides a clear "how". Good test coverage can act as documentation of the proper behavior and help prevent regressions. But I don't see how it follows that clean code makes the location of each implemented feature obvious. It doesn't seem like it would be inherently true.

wolco · on April 5, 2018

But cleaner code can remove the need for a comment.

Retra · on April 6, 2018

But isnan is the least fancy way to check if something is NAN. What you're doing is fancy. I mean, that's exactly why you said "it took me a little while to get my head around it."

aurelian15 · on April 5, 2018

Be careful though when compiling your code with -ffinite-math-only (which is part of -ffast-math and -Ofast). The compiler will likely replace your function with a constant expression returning false, since it assumes that NaNs cannot occur in your code. I found that the only safe way to actually check for NaNs when using these flags is to memcpy the float into a byte array (to avoid aliasing issues) and manually compare against the NaN bitmask.

akrasuski1 · on April 5, 2018

I wouldn't be surprised if the future compilers would optimize it away as well, since ffast-math basicly makes NaN's equivalent to undefined behaviour.

pwinnski · on April 5, 2018

$deity preserve me from "cool ways" of doing things, and lead me into understand of isnan().

dvh · on April 5, 2018

I thought JavaScript has isNaN function

jcoffland · on April 5, 2018

That's not JavaScript.

ranit · on April 5, 2018

JavaScript does have isNaN() function: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

idbehold · on April 5, 2018

Better to use Number.isNaN(): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

cryptonector · on April 5, 2018

The x86_64 address-space size is currently 49 bits. There are two 48-bit halves. It could get bigger eventually. So NaN-coding (or Nan boxing, or whatever you want to call it), is risky.

I remember Solaris had a problem because it would put anonymous mmap()s in one 48-bit range and the heap and stacks in the other range, and this broke some ECMAScript implementation that used NaN-coding with the assumption that all data would be in the top 48-bit address space.

jwilk · on April 5, 2018

No, there are two 47-bit halves. (But see my other comment about 57-bit address space.)

cryptonector · on April 6, 2018

Ay, sry.

apaprocki · on April 6, 2018

I had this issue on both Solaris/SPARC and AIX/POWER because both use the full 64-bit address space. It required jumping through hoops to make sure the custom allocator could both allocate 1MB aligned pages that were also < 48 bits so that it would play nice with SpiderMonkey.

akira2501 · on April 5, 2018

> There are two 48-bit halves. It could get bigger eventually.

Which is precisely why the implemented in that way, to prevent "clever" uses like this from eventually breaking on chipsets that define more address bits than the early default.

kccqzy · on April 6, 2018

I believe the address space size has recently been expanded through 5-level paging. This is now supported in Linux since commit b1b6f83ac938d176742c85757960dec2cf10e468.

rwmj · on April 5, 2018

This is timely. There's some problem/difference between RISC-V and other IEEE implementations around propagating NaNs. It causes issues with numpy and R.

https://groups.google.com/a/groups.riscv.org/forum/#!topic/i...

wolfgke · on April 5, 2018

"for instance, the C standard defines and requires the SIGFPE signal to represent floating point computational exceptions.".

The C standard does not define and/or require this. What you link and refer to is the POSIX standard.

keithwinstein · on April 5, 2018

Not quite -- the C standard does define SIGFPE, and sort of requires it. C99 says "C does require that SIGFPE be the signal corresponding to arithmetic exceptions, if there is any signal raised for them." See clause 7.14 ("Signal handling") and H.3.1.2 ("Traps").

zeveb · on April 5, 2018

> Double precision NaNs come with a payload of 51 bits which can be used for whatever you want — one especially fun hack is using the payload to represent all other non-floating point values and their types at runtime in dynamically typed languages.

That is genius! It'd be a little weird using a system where the most positive fixnum is 51 bits, but it wouldn't be terrible — and of course that's why bignums exist. And 50 or 49 bits are certainly more than large enough for realistic RAM sizes for quite awhile — the latter is roughly half a petabyte.

It seems from the article that it's a pretty common technique; I'm surprised that I've not heard of it before.

nixpulvis · on April 5, 2018

NaN is one of the reasons IEEE floating point numbers don't have a fully defined order.

hermitdev · on April 6, 2018

Sort of like ANSI NULL in databases. They're purposefully not supposed to equate (or not equate) to anything, even theirselves.

undecidabot · on April 5, 2018

A little "fun" fact about NaN: they are never equal to each other, even if it's the exact same value and variable.

How can you check if a variable is NaN then? Well, if it's not equal to itself then it must be NaN!

jwilk · on April 5, 2018

> 48 bit pointers (the current x86-64 pointer bit-width)

For completeness: Linux supports 57-bit address space on x86-64: https://lwn.net/Articles/117749/

I don't know if there's any real hardware that supports this, though.

jwilk · on April 5, 2018

Oops, wrong link. I meant this: https://lwn.net/Articles/717293/

baybal2 · on April 5, 2018

My first encounter with Javascript: Not a Number is a Number.

In my youth, I spent days wondering how in the world numbers in animation scripts are not "NaN" strings the moment something causes zero division, but are still numbers called NaN.

scrupulusalbion · on April 5, 2018

Can arithmetic be applyied to NaN a la i? (e.g. 5 + 2 * NaN = 5 + 2NaN)

mort96 · on April 5, 2018

NaN is sort of viral. Any math operator (afaik) returns NaN if any of the operands is NaN. That means `5 + (2 * NaN)` becomes `5 + NaN`, which just becomes `NaN`.

Still, `5 + 2 * NaN == 5 + 2 * NaN` would return false, because it becomes `NaN == NaN`, and NaN is not equal to NaN.

logicallee · on April 6, 2018

what about 0 * instead of 2 * ? On one hand by the viral property it would seem to also be unequal? On the other hand having zero NaN is something every equation already has :D it seems fair to silently remove a 0 * NaN term.

mturmon · on April 6, 2018

It's not fair to remove 0 * NaN.

NaN will come up when doing

  Inf - Inf

But clearly:

  0 * (Inf - Inf)

cannot be said to be zero.

NaN really is not a number.

hermitdev · on April 6, 2018

IIRC, but I don't have the IEEE spec in front of me, any operation, binary or unary, that has an operand of NaN shall have the result of NaN.

Retra · on April 6, 2018

"0 * NaN" makes as much sense as "0 * Blue". It's NaN because you're trying to do something that is fundamentally not multiplication.

logicallee · on April 6, 2018

thanks - makes sense and is a clear explanation.

skybrian · on April 5, 2018

Go is sort of the opposite of this. If you do something like:

func sum(a, b float64) interface{} { return a+b }

It will allocate memory for the float, because an interface type always contains two pointers. [1]

It's pretty crazy that the second word in a Go interface has to be a pointer. But I suppose if dynamic types were used more, they would be optimized more.

[1] https://www.darkcoding.net/software/go-the-price-of-interfac...

zaarn · on April 6, 2018

I think it's more about not having exceptions. If the second word is always a pointer then you can make code that blindly assumes such and optimize for that.

If it may not be a pointer you have to branch, which can be add some overhead to the lucky path (every pointer deref will have to wait for type comparison in machine code)

If the float has just been written, the data should be in cache so the performance penalty is probably not too bad in most cases.

skybrian · on April 6, 2018

Well, the problem is when you have a []interface{} that contains a lot of float64 mixed with other things. If the size of the slice is large, I don't think you can assume it's all cached.

zaarn · on April 9, 2018

If the slice is large, it's less likely to be cached either way.

skybrian · on April 10, 2018

True, but there is also locality of reference (having everything in a contiguous block uses less cache than pointer-chasing) and garbage collection pressure.

In principle, using NaN boxing or a similar technique to store non-floats, you should be able to store floats in an []interface{} as efficiently as in a []float64. Scripting languages do this, but not Go.

al2o3cr · on April 6, 2018

Another similar NaN-boxing approach, available as an option in mruby:

https://github.com/mruby/mruby/blob/d6cb4f9cf2027eb20f67238a...