The behavior of 'assert' is not an anomaly. It comes from 'design by contract.' Assert is primarily meant to be documentation of constraints in code and secondarily a way of catching errors during development.
"Contract conditions should never be violated during execution of a bug-free program. Contracts are therefore typically only checked in debug mode during software development. Later at release, the contract checks are disabled to maximize performance." - https://en.wikipedia.org/wiki/Design_by_contract
That is certainly one approach, and the article agrees.
> The root cause of this weakness is that the assert mechanism is designed purely for testing purposes, as is done in C++.
However, C and C++ are perhaps unique in how much undefined behavior is possible and in how simple it is to trigger. Inserting into a vector while iterating over it, for instance, or reading through an uninitialized pointer.
That's why many C++ experts believe in runtime assertions in production. Crashing the application with a core dump is generally preferable to trashing memory, corrupting your database, or launching the missiles.
All of the C/C++ experts I know, as well as people from primarily that background who have interviewed me, have always been adamant that an application crashing unexpectedly should never happen and is always the wrong outcome.
I imagine they would say that your statement about crashing vs. e.g. launching the missiles is a false dilemma. You don't crash and you don't incorrectly launch the missiles.
I'm not a C++ developer so I can't say it with certainty. I more agree with what you're saying. I'm just relaying that my experience has been that out of many different language communities, C++ actually seems adamantly the opposite of what you're describing.
I think the industry is on the cusp of settling on the Erlang model which is essentially allowing pieces of a program to crash so that the whole program doesn't have to. It will take time for practices and tools to spread.
I have occasionally needed to argue with a long-time C dev that crashing is exactly what I want my program to do if the user gives unexpected input. They're used to core dumps instead of pleasant tracebacks.
I'm a big fan of fatal errors and crashing the program with a stack trace:
1. A stack trace at the point of a contract violation tends to capture the most relevant context for debugging -- the faster an issue can be discovered and debugged, the easier it is to fix.
2. Interacting code has to become sufficiently coupled to preserve "sane program state" -- an exception may or may not be recoverable, but a fatal error never is, so there's no point in building code that tries to recover. If the programmer has to design the interaction among program components to avoid fatal errors, then there must be fewer total states in the program than in a program which recovers from errors -- this makes the program easier to reason about.
3. On delivering a good user experience -- I'd rather have clear and obvious crashes, which are more likely to include the most relevant debug information, than deliver the user some kind of non-crash but non-working behavior (with possibly unknown security consequences) which may take longer to get noticed and fixed because of an error handling mechanism that deliberately _tries_ to paper over programming problems...
I've actually modified third-party libraries I've used to remove catch blocks or replace their internal error handling with fatal errors -- when dealing with unknown code it can vastly speed up learning the code and understanding its observable behavior, especially around edge cases.
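Concretely, the kind of change I mean looks roughly like this (a Python sketch, not any particular library):

    import sys
    import traceback

    def do_work(item):
        ...  # stand-in for the third-party code being studied

    def do_work_or_die(item):
        # Replaces the library's own catch-and-continue block: any
        # unexpected error becomes fatal, with the full traceback intact.
        try:
            return do_work(item)
        except Exception:
            traceback.print_exc()
            sys.exit("fatal: unexpected error in do_work")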
In my experience, "You don't crash" means you catch the exception and exit gracefully, reporting a fatal error has occurred. Users don't distinguish between a crash and a fatal error.
Higher level languages are better at reporting uncaught runtime errors than C or C++, because they'll automatically do things like print a useful stack trace and then exit gracefully even if you don't catch an exception. The interpreter doesn't crash when your code does.
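For instance, Python exposes this via `sys.excepthook`; a small sketch (the message is made up) that customizes the report while keeping the graceful exit:

    import sys

    def report_fatal(exc_type, exc_value, exc_tb):
        # Called by the interpreter for any uncaught exception: print the
        # usual traceback plus a user-facing notice; the interpreter still
        # exits with a non-zero status afterwards.
        sys.__excepthook__(exc_type, exc_value, exc_tb)
        print("A fatal error occurred; please report the traceback above.",
              file=sys.stderr)

    sys.excepthook = report_fatal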
I think you misunderstand the use case. If your container is tracking work done and it thinks 20 requests were handled and only 10 were received, you have an invariant failure. Without more context, this could easily be trashed memory, in which case, you might already be in the middle of undefined behavior. In that case, getting the hell out of the process is the most responsible course of action. Efforts to even log what happened could be counterproductive. You might log inaccurate info or write garbage to the DB.
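A hypothetical Python sketch of the kind of invariant I'm describing (names invented):

    class WorkTracker:
        def __init__(self):
            self.received = 0
            self.handled = 0

        def receive(self):
            self.received += 1

        def handle(self):
            self.handled += 1
            # Invariant: we can never have handled more requests than we received.
            # If this fires, internal state is corrupt and continuing is unsafe.
            assert self.handled <= self.received, (
                f"handled {self.handled} requests but only received {self.received}")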
Also, if you don't catch an exception in C++, most systems will give you a full core, which includes a stack trace for all running threads. Catching an exception and 'exiting cleanly' actually loses that information.
I've been writing software in C and C++ for a long time. Crashing is never a good user experience, so avoid it. If something unexpected happens, catch and report the error, then carry on, if possible, or gracefully exit otherwise.
As someone who works in support, customers (at least the ones I support) REALLY need a clear and obvious crash to understand something's wrong. That really forces them to "do something differently" and/or look for help. You're correct in that it's not a good user experience. Neither is chaos.
Exactly. "Undefined behavior" includes showing private data to the wrong user and booking ten times more orders than the user originally indicated. I'll take crashing over that.
It really depends on the type of software. Sometimes if something unexpected happens and you just catch and report an error, then you may end up in a state which will result in further errors a few hours later. It is much easier to track the root cause, if the program crashes immediately, than having to analyze several hours of logs. And then crashing (i.e. quitting with a core dump) is as graceful as it can be, since it provides all the necessary information to analyze the problem right when it happened.
I think it depends on the job the program is being used for. If the program is used in a setting where an occasional crash has no severe consequences, say a search engine backend, it may be useful for the company to run in production a program that is allowed to crash whenever a severe error condition occurs. In scenarios where lives or lots of money hinge on the program being alive and functioning, such as a plane autopilot or rocket / space probe control, a crash or fatal error of the main program often means disaster and thus should never occur. If an error condition occurs during execution, the program should withstand it and continue with a default path of execution. In the past, lots of lives and money could have been saved if only the software and hardware had conformed to this paradigm.
I agree with 'it depends' except for the case of safety critical systems, which I actually have experience in. A proper safety critical system should also be able to withstand a crash in an arbitrary process. The thing to remember is that there is no default path of execution if there is undefined behavior in C or C++ code. The process may do the worst thing it can, at least with the OS-level permissions it has.
It's not an anomaly, but it can be a surprise for people who don't understand what it does. For example, they use asserts for validation and then the validation doesn't work in production. It's absolutely right the way it works, but it's still a gotcha for the audience this blog post is aimed at.
The point is that they think they understand it, because most of Python behaves the same in development and production. You see this function called 'assert', it gives the right error at the right time, and all is good. Then you push it to production and it stops throwing errors. Eventually, you read the manual and it tells you that this specific function is ignored in production. This is a surprise because, say, print doesn't behave like that.
That only happens if you have different production and development environment settings though -- in which case you should expect the different results.
In this particular case, you compiled your code with "-O", so it's not the "same code" used in production, but code compiled with a different flag. Shouldn't they check what the flag does?
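A two-line file makes the difference visible (file name is hypothetical):

    # flag_demo.py
    assert 1 == 2, "caught during a normal run"
    print("__debug__ is", __debug__)

    # $ python flag_demo.py     -> AssertionError: caught during a normal run
    # $ python -O flag_demo.py  -> prints "__debug__ is False"; the assert was
    #                              never compiled in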
The developer may be deploying code to a server they didn't configure.
I agree with the way it's done in Python, as it's consistent with most other languages. But the blog post is right to point out to inexperienced developers that the way assert behaves might give them a surprise.
Unless the article was updated after your comment: the reason is right there in the article:
"However, Python does not produce any instructions for assert statements when compiling source code into optimized byte code (e.g. python -O). That silently removes whatever protection against malformed data that the programmer wired into their code leaving the application open to attacks.
The root cause of this weakness is that the assert mechanism is designed purely for testing purposes, as is done in C++. Programmers must use other means for ensuring data consistency."
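To make that last sentence concrete: validation that must survive `-O` has to be an explicit check-and-raise, with asserts reserved for internal sanity checks. A sketch (names made up):

    def set_quantity(order, quantity):
        # Explicit validation of external input: still runs under 'python -O'.
        if not 1 <= quantity <= 100:
            raise ValueError(f"quantity must be between 1 and 100, got {quantity}")
        order["quantity"] = quantity
        # Internal sanity check: fine as an assert, since it documents an
        # invariant and may legitimately disappear in optimized builds.
        assert order["quantity"] > 0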
I would argue that one should never use '-O' (and '-OO' goes further and strips docstrings from running code). Not really an `optimization`, but they had to do something, right? One couldn't run __unoptimized__ in production, could they?
Couldn't the assert message just say something like "warning: this is only checked in development"? I don't know; requiring people to know how something works is always kind of tough, since a lot of people's first interactions with things are in the code (like if they've just joined a new project), and they may assume they understand the functionality, and their assumptions may initially seem correct as they test it themselves. It's one of those "don't know what you don't know" scenarios, and "look up every function you ever see just in case what you think it does isn't what it actually does!" can be a bit impractical. So if this is known to be a gotcha, making the function itself speak to that gotcha might be useful.
> Assert is primarily meant to be documentation of constraints in code
Real or imagined constraints? AFAICT, an assert only tells me what you wish your program did, but that has absolutely no bearing on what it will actually do.
>AFAICT, an assert only tells me what you wish your program did, but that has absolutely no bearing on what it will actually do.
Depending on the implementation, an assert can either merely log or absolutely stop a program that doesn't pass its test, so it very much has a bearing on what the program will actually do.
Then imagined it is. Your desires are totally a part of your imagination, unless you make them become real.
> Depending on the implementation, an assert can either merely log or absolutely stop a program that doesn't pass its test, so it very much has a bearing on what the program will actually do.
Point taken. Unfortunately, logging errors or aborting the program won't make assertions magically become true, though.
Only in the most pedantic and useless sense of the term.
Asserts are not just some random imagination; they are added based on the program's specifications and its expected/desired functionality and constraints. The CS term for those kinds of constraints is "invariants", and asserts are a way to be notified if those invariants are violated.
>unless you make them become real.
Only there are no assurances for that. If the invariants in your program were somehow guaranteed to be "real" then you wouldn't need asserts.
Asserts are there because whether you tried to make your invariants "become real" or not, you'll still miss things, have bugs, have unexpected interactions with code/systems outside your control etc. So they are there to tell you about those misses.
>Point taken. Unfortunately, logging errors or aborting the program won't make assertions magically become true, though.
Assertions are not expected to "magically become true" -- just to (a) inform about anytime they are violated, and, optionally, (b) not be violated and still have the program continue to run.
> The CS term for those kind of constraints are "invariants", and asserts are a way to be notified if those invariants are violated.
An “invariant” is a function of the process state whose value remains constant (hence “invariant”) in spite of changes to the process state. Perhaps you meant “precondition” or “postcondition”?
> Only there are no assurances for that. If the invariants in your program were somehow guaranteed to be "real"
Guaranteeing that preconditions, postconditions and invariants hold when they're supposed to hold is your job, not the computer's.
> then you wouldn't need asserts.
I absolutely don't need asserts. An assert merely describes what you want, but that's useless to me, unless you establish a relation between what you want and what your program actually does - with proof.
> Asserts are there because whether you tried to make your invariants "become real" or not, you'll still miss things, have bugs,
It will become patently clear when the proof doesn't go through.
> have unexpected interactions with code/systems outside your control etc.
What happened to sanitizing input at system boundaries?
> Assertions are not expected to "magically become true"
Of course. Assertions are expected to always be true.
>An “invariant” is a function of the process state whose value remains constant (hence “invariant”) in spite of changes to the process state. Perhaps you meant “precondition” or “postcondition”?
No, I meant invariant. An invariant is something that is supposed to hold true, not just something that is guaranteed to hold true (e.g. a constant that can't ever change anyway). That's why we need assertions to check that invariants hold.
From Wikipedia:
"In computer science, an invariant is a condition that can be relied upon to be true during execution of a program, or during some portion of it. It is a logical assertion that is held to always be true during a certain phase of execution. (...) Programmers often use assertions in their code to make invariants explicit."
Preconditions and postconditions are similar in concept, but are supposed/wanted to hold true before (pre) or after (post) a method runs.
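A toy Python example of all three, to make the distinction concrete (the class and numbers are made up):

    class Account:
        def __init__(self, balance):
            self.balance = balance
            self._check_invariant()

        def _check_invariant(self):
            # Invariant: holds before and after every public operation.
            assert self.balance >= 0, "balance can never go negative"

        def withdraw(self, amount):
            # Preconditions: must hold on entry.
            assert amount > 0, "withdrawal must be positive"
            assert amount <= self.balance, "cannot withdraw more than the balance"
            old = self.balance
            self.balance -= amount
            # Postcondition: must hold on exit.
            assert self.balance == old - amount
            self._check_invariant()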
>I absolutely don't need asserts. An assert merely describes what you want, but that's useless to me, unless you establish a relation between what you want and what your program actually does - with proof.
Well, asserts weren't created specifically for you. Feel free not to use them.
They are useful to me and, judging from their widespread use, to others, even if they don't formally prove the program does 100% of what it needs to (which nobody expected them to do anyway).
Until we all program in Coq or similar, they will be useful for all kinds of checks. A correct program is a spectrum, not a binary option.
>Of course. Assertions are expected to always be true.
No, they are also expected to sometimes be false -- that's why we add assertion statements to check whether our assertions hold. But we're splitting hairs two or three times over here.
> That's why we need assertions to check that invariants hold.
No, you need proof.
> Until we all program in Coq or similar
So you're saying humans are fundamentally incapable of establishing the logical validity of what they assert by themselves? This contradicts historical evidence that people have done this for well over 2 millennia, using various methods and tools.
> A correct program is a spectrum, not a binary option.
Some errors might be easier to fix or have less disastrous consequences than others, but a correct program is one that has no errors, so I don't see where the spectrum is.
>That's why we need assertions to check that invariants hold.
>No, you need proof.
You might need proof, but that doesn't mean you'll get it. In most languages in common use (i.e. not Coq and co), and for any larger-than-trivial program, "proof" is impossible.
So, we'll continue to need all the tools we can realistically use, including assertions, unit tests and others.
>So you're saying humans are fundamentally incapable of establishing the logical validity of what they assert by themselves? This contradicts historical evidence that people have done this for well over 2 millennia, using various methods and tools.
This particular question is not even wrong in the context of the discussion. I don't usually throw around the term "troll", but you're either trolling or being naive on principle / too pedantic.
In any case, whether or not people are "capable of establishing the logical validity of what they assert by themselves" for trivial things or narrow domains, they absolutely have not been able to do it manually, or fast enough to be practical, for software programs, especially non-trivial ones. Even the best programmers introduce bugs and see behavior in their programs that they didn't expect.
Which is also why even the best programmers use assertions. It's not some obscure feature relegated to newbies or bad programmers. It's a standard practice, even in the most demanding and hardcore programming environments, from the Linux kernel (which uses the BUG_ON assertion macro) to NASA rocket code.
Or I could turn "troll mode" on and answer in the same vein as the question: if "people have done this for well over 2 millennia, using various methods and tools", then they haven't been doing it "by themselves" any more than when using assertions (which are also one of those "tools").
And of course, I haven't anywhere stated that "humans are fundamentally incapable of establishing the logical validity of what they assert by themselves".
The gist of my comment would be merely that humans are bad at establishing the logical validity of their computer programs by themselves -- for which there is ample "historical evidence".
>Some errors might be easier to fix or have less disastrous consequences than others, but a correct program is one that has no errors, so I don't see where the spectrum is.
The spectrum is obviously in that correctness is not black and white, and all non-trivial programs have bugs in practice. Programs whose bugs are few and far between are more correct than others.
> If you have the process quit it definitely stops them from being false though.
The assertion remains false for the final process state, before the process quits. Outside of the process, the assertion is simply meaningless (neither true nor false), because the assertion's free variables are only bound inside the process.
>The assertion remains false for the final process state, before the process quits.
Which is inconsequential. Programmers don't expect automatic "recovery" from assertions; they expect them to give notice of the violated constraint, and/or to ensure that the program won't go on and use a value that violates the assertion further down -- which program termination achieves.
>Outside of the process, the assertion is simply meaningless (neither true nor false), because the assertion's free variables are only bound inside the process.
> He meant from being false subsequently in the program.
But, you see, the assertion is no less false just because the process was aborted. The fact remains that there exists a reachable state for which the assert fails. So apparently what I meant is no more obvious to you than it was for JonnieCache.
>But, you see, the assertion is no less false just because the process was aborted. The fact remains that there exists a reachable state for which the assert fails
Yes, Captain Obvious, and that reachable state is exactly what every programmer who uses an assert() statement expects when he writes it.
If there wasn't the potential for such a state, assert statements would do nothing ever in the first place -- so it would be kinda silly to even have them in.
"Contract conditions should never be violated during execution of a bug-free program. Contracts are therefore typically only checked in debug mode during software development. Later at release, the contract checks are disabled to maximize performance." - https://en.wikipedia.org/wiki/Design_by_contract