> the big benefit is parallel computing at a massive scale
The problem with this line of reasoning is that, even though a quantum system might have many possible states, we only observe a single one of those states at the time of measurement. If you could somehow prepare a quantum system such that it encoded the N equally-likely solutions to your classical problem, you would still need to rerun that experiment (on average) N times to get the correct answer.
Broadly speaking, quantum computing exploits the fact that states are entangled (and therefore correlated). By tweaking the circuit, you can make it so that incorrect solutions interfere destructively while the correct solution interferes constructively, making it more likely that you will measure the correct answer. (Of course this is all probabilistic, hence the need for quantum error correction.) But developing quantum algorithms is easier said than done, and there's no reason to think a priori that all classical problems can be recast in this manner.
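This interference trick can be sketched with a toy state-vector simulation of Grover-style amplitude amplification (my own illustrative example with made-up numbers; a real device manipulates entangled qubits rather than an explicit array of amplitudes):

```javascript
// N = 16 candidate solutions, exactly one of which (index 5) is "correct".
const N = 16;
const target = 5;
let amp = new Array(N).fill(1 / Math.sqrt(N)); // uniform superposition

function groverIteration(amp) {
  amp = amp.slice();
  amp[target] = -amp[target];                  // oracle: flip the solution's sign
  const mean = amp.reduce((s, a) => s + a, 0) / N;
  return amp.map(a => 2 * mean - a);           // diffusion: reflect about the mean
}

// The optimal iteration count is about (pi/4) * sqrt(N), i.e. 3 here.
for (let i = 0; i < 3; i++) amp = groverIteration(amp);

const pCorrect = amp[target] ** 2;
console.log(pCorrect.toFixed(3)); // 0.961 - vs. 1/16 = 0.0625 for blind sampling
```

The sign flip plus reflection makes the correct amplitude grow while the others shrink, which is exactly the constructive/destructive interference described above.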
I think that the big challenge is to recast any classical computations as quantum computations with a superpolynomial speedup.
I think that all classical problems can be cast as quantum computations because quantum computation is just computation - I believe that one can implement a Turing machine using quantum gates, so arbitrary computation is possible with quantum gates.
The superpolynomial speedups are the thing... I wonder if these will be limited to a class of computations that have no physical realization - just pure maths.
From the New York Times: "How Polarized Politics Led South Korea to a Plunge Into Martial Law" [1]
> From the start [...] Mr. Yoon faced two obstacles.
> The opposition Democratic Party held on to its majority in the National Assembly and then expanded it in parliamentary elections in April, making him the first South Korean leader in decades to never have a majority in Parliament. And then there were his own dismal approval ratings.
> Mr. Yoon’s toxic relationship with opposition lawmakers — and their vehement efforts to oppose him at every turn — paralyzed his pro-business agenda for two years, hindering his efforts to cut corporate taxes, overhaul the national pension system and address housing prices.
and also
> Opposition leaders warned that Mr. Yoon was taking South Korea onto the path of “dictatorship.” In turn, members of Mr. Yoon’s party called the opposition “criminals,” and voters on the right rallied against what they called “pro-North Korean communists.”
> (Mr. Yoon echoed that language on Tuesday in his declaration of martial law, saying he was issuing it “to protect a free South Korea from the North Korean communist forces, eliminate shameless pro-North Korean and anti-state forces.”)
So basically, Mr. Yoon was unable to pass his agenda (as his party never had control of the legislative branch), and rather than continue to negotiate, he decided to impose martial law, label the opposition communists, and then ban the National Assembly from gathering (they gathered anyway).
Empirical Bayes is exactly what I was getting at. It's a pragmatic modelling choice, but it loses the theoretical guarantees about uncertainty quantification that pure Bayesianism gives us.
(Though if you have a reference for why empirical Bayes does give theoretical guarantees, I'll be happy to change my mind!)
Germany spends some 70 billion euros maintaining the road system, only about a third of which is offset by taxes on drivers [1]. If we accept that investments in roads reduce carbon emissions by 0 million tonnes per year, then that works out to NaN € per tonne -- much worse than other carbon abatement methods!
Naturally there might be other positive externalities to owning a car, but I don't own a car and therefore wouldn't be privy to them. Instead I rely almost exclusively on Germany's public transport for my daily commutes, which I find perfectly satisfactory for this purpose and significantly more convenient than parking and maintaining a car.
A third can't be right. 15 billion from petrol and 18.2 billion from diesel alone make up almost half of that. 48.76 million vehicles times €100 (back of the envelope calculation) for vehicle tax puts that number above half.
From the article (sorry for the bad link before; fixed below [1]):
> The revenue from taxes and levies on road traffic amounts to around 50 billion euros annually. Around half of this is earmarked by law via the mineral oil tax, i.e. around 25 billion euros. This means that just over a third (36%) of the earmarked revenue from road traffic covers the costs of roads and other facilities such as parking lots and the like. It is therefore clear that the public sector is heavily subsidizing road traffic.
My understanding is that these two forms of taxes add to more than 50%, but then almost half of those taxes must be reinvested elsewhere by law (i.e., not into roads), hence the 1/3 figure. But even if you ignore this reallocation of taxes, you still have a deficit of around 20 billion euros.
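For what it's worth, the quoted figures are internally consistent. A back-of-envelope check (using only the rounded numbers from the article):

```javascript
// All figures in billion euros per year, as quoted above.
const roadCosts = 70;                    // total cost of the road system
const trafficTaxRevenue = 50;            // taxes and levies on road traffic
const earmarked = trafficTaxRevenue / 2; // roughly half is earmarked for roads by law

// Share of road costs covered by the earmarked revenue:
console.log(Math.round(earmarked / roadCosts * 100) + "%"); // "36%"

// Deficit even if ALL traffic-tax revenue went to roads:
console.log(roadCosts - trafficTaxRevenue); // 20
```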
That's because the selection of numbers is a little bit weird. The 70 billion on the cost side are composed of 38 billion for construction and maintenance, 14 billion for traffic police and 18 billion for public funds spent for accidents. The generated income is only taxes on fuel and the tax car owners have to pay.
If you include the cost of the traffic police, there is much more you can include on the income side, such as taxes on car sales, and part of the costs also comes back to the government in the form of taxes. A large part of the costs is likely missing as well. Doing this properly is a lot of work, and doing it precisely is hard to impossible; these sorts of accountings almost always rely on estimates for the higher-order effects.
Btw: I googled the study[1] and apparently it was funded by the "Netzwerk Europäischer Eisenbahnen e.V." (Network of European Railways Association). I would take any statements and numbers with a huge grain of salt.
Maybe in UK local roads are funded differently? If you counted only national roads in Poland it would seem that Poland takes more in revenue than spends on roads, which isn't true if you count expenses on all the local roads that aren't in national budget.
Even if you don't drive a car yourself you depend a lot on things delivered by road. Most of the road wear is done by trucks bringing you food, construction materials and whatever else. You can't exactly replenish your local grocery store by rail or cargo bike.
Most of the need for roads, though, is cars. It is rare to see a multi-lane road where trucks are restricted to a single lane and cars allowed the rest - even though trucks wouldn't fill that single lane, and the other lanes could in turn be built much more cheaply. (In part because a road built for cars isn't that much cheaper than one built for trucks: labor costs about the same, and weather is often a large factor.)
> JS minification is fairly mechanical and comparably simple, so the inversion should be relatively easy.
Just because a task is simple doesn't mean its inverse need be. Examples:
- multiplication / prime factorization
- deriving / integrating
- remembering the past / predicting the future
Code unobfuscation is clearly one of those difficult inverse problems, and it is easily made harder by any of the following:
- bugs
- unused or irrelevant routines
- incorrect implementations that incidentally give the right results
In that sense, it would be fortunate if ChatGPT could give decent results at unobfuscating code, as there is no a priori expectation that it should be able to do so. It's good that you've also checked ChatGPT's code unobfuscation capabilities on a more difficult problem, but I think you've only established an upper limit. I wouldn't consider the example in the OP to be trivial.
Of course, it is not generalizable! In my experience though, most minifiers do only the following:
- Whitespace removal, which is trivially invertible.
- Comment removal, which we never expect to recover via unminification.
- Renaming to shorter names, which is tedious to track but still mechanical. And most minifiers have little understanding of underlying types anyway, so they are usually very conservative and rarely reuse the same mangled identifier for multiple uses. (Google Closure Compiler is a significant counterexample here, but it is also known to be much slower.)
- Constant folding and inlining, which is annoying but can still be tracked. Again, most minifiers lack the reasoning needed to do extensive constant folding and inlining.
- Language-specific transformations, like turning `a; b; c;` into `a, b, c;` and `if (a) b;` into `a && b;` whenever possible. These will be hard to understand if you don't know them in advance, but there aren't too many of them anyway.
As a result, minified code remains comparably human-readable with some note-taking and perseverance. And since these transformations are mostly local, I would expect LLMs to pick them up on their own as well.
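To make those transformations concrete, here is a hand-applied before/after on an invented example (my own sketch, not the output of any particular minifier):

```javascript
// Original:
function formatPrice(amount, currency) {
  const TAX_RATE = 0.19; // will be folded into the expression below
  const total = amount * (1 + TAX_RATE);
  if (currency) {
    return total.toFixed(2) + " " + currency;
  }
  return total.toFixed(2);
}

// After whitespace/comment removal, renaming, constant folding, and the
// branch-to-expression rewrite (re-wrapped here for readability):
function f(a, c) { var t = a * 1.19; return c ? t.toFixed(2) + " " + c : t.toFixed(2); }

console.log(formatPrice(100, "EUR")); // "119.00 EUR"
console.log(f(100, "EUR"));           // "119.00 EUR" - behavior unchanged
```

Undoing each step means re-expanding names and re-introducing the folded constant: tedious, but mechanical.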
I would say the actual difficulty varies greatly. It is generally easy if you have a good guess about what the code would actually do. It would be much harder if you have nothing to guess from, but usually you should have something to start with. Much like debugging, you need a detective mindset to be good at reverse engineering, and name mangling is a relatively easy obstacle to handle at this scale.
Let me give a concrete example from my old comment [1]. The full code in question was as follows, with only whitespace added:
Many local variables should be easy to reconstruct: b -> player, c -> removePlayer, d -> playerDiv1, e -> playerDiv2, h -> playerVideo, l -> blob (we don't know which blob it is yet though). We still don't know about non-local names including t, aj, lc, Mia and m, but we are reasonably sure that it builds some DOM tree that looks like `<ytd-player><div></div><div class="ad-interrupting"><video class="html5-main-video"></div></ytd-player>`. We can also infer that `removePlayer` would be some sort of a cleanup function, as it gets eventually called in any possible control flow visible here.
Given that `a.resolve` is the final function to be executed, even later than `removePlayer`, it will be some sort of "returning" function. You will need some information about how async functions are desugared to fully understand that (and also `m.return`), but such information is not strictly necessary here. In fact, you can safely ignore `lc` and `Mia` because it eventually sets `playerVideo.src` and we are not that interested in the exact contents here. (Actually, you will fall into a rabbit hole if you are going to dissect `Mia`. Better to assume first and verify later.)
And from there you can conclude that this function constructs a certain DOM tree, sets some class after 200 ms, and then "returns" 0 if the video "ticks" or 1 on timeout, giving my initial hypothesis. I then hardened my hypothesis by looking at the blob itself, which turned out to be a 3-second-long placeholder video and fits with the supposed timeout of 5 seconds. If it were something else, then I would look further to see what I might have missed.
I believe the person you're responding to is saying that it's hard to do automated / programmatically. Yes a human can decode this trivial example without too much effort, but doing it via API in a fraction of the time and effort with a customizable amount of commentary/explanation is preferable in my opinion.
Indeed, that aspect was something I failed to get initially, but I still stand by my opinion because most of my reconstruction was local. Local "reasoning" can often be done without actual reasoning, so while it's great that we can automate the local part, that falls short of the full reasoning necessary for general unobfuscation.
This is, IMO, the better way to approach this problem. Minification applies rules to transform code; if we know the rules, we can reverse the process (though we can't directly recover any lost information).
A nice, constrained, way to use a LLM here to enhance this solution is to ask it some variation of "what should this function be named?" and feed the output to a rename refactoring function.
You could do the same for variables, or be more holistic and ask it to rename variables and add comments (but risk the LLM changing what the code does).
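A minimal sketch of that pipeline, with the LLM call replaced by a hard-coded "suggestion" (all names here are invented; a real tool should rename via an AST, since a bare regex can clobber strings, comments, or shadowed names):

```javascript
// Apply a suggested name via a naive word-boundary rename. This regex
// approach is only a sketch - an AST-based rename refactoring is the
// safe way to do this in practice.
function renameIdentifier(source, oldName, newName) {
  return source.replace(new RegExp(`\\b${oldName}\\b`, "g"), newName);
}

const minified = "function f(a){return a*a}console.log(f(4));";

// Pretend the LLM answered: f -> square, a -> n
let readable = renameIdentifier(minified, "f", "square");
readable = renameIdentifier(readable, "a", "n");

console.log(readable); // function square(n){return n*n}console.log(square(4));
```

Constraining the LLM to suggesting names (rather than rewriting code) keeps it from silently changing behavior, which is the risk noted above.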
How do we end up with you pasting large blocks of code and detailed step-by-step explanations of what it does, in response to someone noting that just because process A is simple, it doesn't mean inverting A is simple?
This thread is incredibly distracting, at least 4 screenfuls to get through.
I'm really tired of the motte/bailey comments on HN about AI, where the motte is "meh, the AI is useless, an amateurish answer that's easy to beat" and the bailey is "but it didn't name a couple of global variables '''correctly'''." It verges on trolling at this point, and is at best self-absorbed and making the rest of us deal with it.
Because the original reply missed three explicit adverbs hinting that this is not a general rule (EDIT: and also mistook my comment as dismissive). And I believe it was not in bad faith, so I went on to give more context to justify my reasoning. If you are not interested in that, please just hide it, because otherwise I can do nothing to improve the status quo - and I personally enjoyed the entire conversation.
> As a result, minified code still remains comparably human-readable with some note taking and perseverance.
At least some of the time, simply taking it and reformatting to be unfolded and on multiple lines is useful enough to be readable/debuggable. FIXING that bug is likely more complex, because you have to find where it is in the original code, which, to my eyes, isn't always easy to spot.
As a point of order, code minification != code obfuscation.
Minification does tend to obfuscate as a side effect, but that is not the goal, so reversing minification is much easier. Obfuscation, on the other hand, can minify code, but crucially that isn't where it starts from. Since the goals of minification and obfuscation differ, reversing them takes different effort, and I'd much rather attempt to reverse minification than obfuscation.
I'd also readily believe there are hundreds or thousands of examples online of reversed minification (or "here is code X, here is code X _after_ minification") that LLMs have ingested in their training data.
Yeah, having run some state of the art obfuscated code through ChatGPT, it still fails miserably. Even what was state of the art 20 years ago it can't make heads or tails of.
> In a 5-to-4 decision, written by Justice Neil M. Gorsuch, a majority of the justices held that the federal bankruptcy code does not authorize a liability shield for third parties in bankruptcy agreements. Justice Gorsuch was joined by Justices Clarence Thomas, Samuel A. Alito Jr., Amy Coney Barrett and Ketanji Brown Jackson.
> GORSUCH, J., delivered the opinion of the Court, in which THOMAS, ALITO, BARRETT, and JACKSON, JJ., joined. KAVANAUGH, J., filed a dissenting opinion, in which ROBERTS, C. J., and SOTOMAYOR and KAGAN, JJ., joined.
That logic doesn't make any sense unless you assume there are 100 men to every woman on earth, or alternatively, that women have 100 times more sexual partners than men. Evidence suggests otherwise.
Or put another way: you're comparing one group where 1/200 is on birth control to another group where 99/200 is on birth control. Obviously the group that has almost 100 times more people on birth control is more "efficient". That has nothing to do with sex.
But to address the issue of efficiency, I would expect male birth control is more "efficient", as women can only carry a few children at a time while men can theoretically impregnate dozens or even hundreds of women in the same period.
But I find the quibbling over efficiency to be missing the point. Male contraception offers peace of mind to some. It has nothing to do with efficiency.
> That logic doesn't make any sense unless you assume there are 100 men to every woman on earth, or alternatively, that women have 100 times more sexual partners than men. Evidence suggests otherwise.
Even if men and women have the same number of sexual partners, female birth control will be more efficient if that number is more than one.
Consider a hypothetical group of 1000 men and 1000 women forming 1000 monogamous couples, using no birth control and frequently having sex. If you randomly pick half the men and give them male birth control, or randomly pick half the women and give them female birth control, then you'll cut the number of births approximately in half.
For that group then male and female birth control are equally efficient.
Consider a similar group, 1000 men and 1000 women, but without monogamous couples. Each person has a dozen partners they regularly have sex with. In that group giving 500 women birth control will cut the number of births in half. Giving 500 men birth control will lower the number of births some but not much.
For that group female birth control is more efficient.
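The two hypothetical groups can be checked with a quick simulation (the numbers are the comment's illustration, not real-world data; a woman is counted "at risk" iff she is not on birth control and has at least one unprotected partner):

```javascript
const N = 1000; // 1000 men, 1000 women

// Count women who could still conceive when half of one sex uses birth
// control. We protect the first half of a sex; since partners are chosen
// at random (or paired symmetrically), this matches a random half.
function atRisk(partnersPerWoman, protectMen) {
  const manProtected = m => protectMen && m < N / 2;
  const womanProtected = w => !protectMen && w < N / 2;
  let count = 0;
  for (let w = 0; w < N; w++) {
    if (womanProtected(w)) continue;
    // Monogamy: woman w's only partner is man w.
    // Otherwise: partnersPerWoman random partners.
    const partners = partnersPerWoman === 1
      ? [w]
      : Array.from({ length: partnersPerWoman },
                   () => Math.floor(Math.random() * N));
    if (partners.some(m => !manProtected(m))) count++;
  }
  return count;
}

console.log(atRisk(1, true));   // monogamy, protect half the men   -> 500
console.log(atRisk(1, false));  // monogamy, protect half the women -> 500
console.log(atRisk(12, false)); // 12 partners, protect the women   -> 500
console.log(atRisk(12, true));  // 12 partners, protect the men     -> ~1000
```

With 12 partners, a woman avoids risk only if all 12 happen to be protected (probability about 0.5^12), so male birth control barely moves the number, matching the argument above.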
QCD is about the strong forces, i.e. about the nucleus, which is a proton or a deuteron in the case of hydrogen.
However, nobody has ever succeeded in making a useful simulation of the proton or of any other hadron.
Such a simulation must be based on a small number of universal parameters, e.g. the masses of the quarks etc., and it must be able to compute useful physical quantities, e.g. the masses of the hadrons, the magnetic moments of the hadrons, the energies of their excited states and so on, with a precision comparable with that of the empirical measurements.
To date, nobody has succeeded in performing such a simulation.
On the other hand, if we assign to the proton the properties determined by empirical measurements (i.e. mass and magnetic moment), quantum electrodynamics can be used to compute with high precision various properties of the proton-electron system, i.e. of the hydrogen atom.