DALL·E 2 and The Origin of Vibe Shifts (every.to/divinations)
235 points by dshipper on April 23, 2022 | hide | past | favorite | 156 comments


This DALL-E 2 is absolutely the wildest thing I've seen by far. If we can get it to render 2D images like this, imagine 2D films; fully textured 3D meshes and environments are almost a certainty in the future.

The implications of this are tremendous. We could be looking at crowd-directed, AI-generated consumable content.

Imagine that you could watch infinite variations of the ending in Sopranos season 6. Imagine that you could generate a production quality 3D animatable human mesh that can interact with the AI generated 3D environment.

Pair this with ultrasound-based non-invasive haptic feedback, taste, and smell, plus some type of 3D holographic projector. The possibilities are literally endless.

Holy moly!!!! We are on the cusp of a completely new way to generate and consume content. Copyright laws will have to be rewritten. Once this Pandora's box is open, it can't be closed.


> We are on the cusp of a completely new way to generate and consume content. Copyright laws will have to be rewritten. Once this Pandora's box is open, it can't be closed.

The question is not if but why

We don't need more content, we need more curated and higher quality content.

Random by definition has a very low probability of producing quality.

Case in point: The Walking Dead.

It's already an infinite variation of the same topic but the good bits ended after season 2...


The holy grail is to generate content for you!

The future will combine these tools with data about you (aesthetic preferences, story arcs you respond to well, your current mood) and generate content on the fly. I assume there will still be people working on the skeleton or basic structure of such content, but the rest will be filled in by these tools.


I don’t buy it. There was a discussion on a post about Netflix about how their “long tail” of niche content is getting totally blown out of the water by the likes of Ted Lasso, Severance, and pretty much any of the Mouse’s tentpole properties. Hell, Netflix’s most successful shows right now are all big tentpole productions by Shonda Rhimes.

The problem with art and aesthetics (and many other things) is that (1) people often don’t know what they want until they see or hear it, and (2) there is a social element to it: people derive value from being part of a shared experience.


> The future will combine these tools with data about you

I'm not sure anyone has let you know yet but this future you're imagining was cancelled a long time ago.

We're going to start seeing more global food shortages this year; it won't impact you much yet (beyond small price increases), but in a decade expect your quality of life to be noticeably worse than it is now.

The world you're imagining living in is a world powered by tremendous amounts of fossil fuel energy. But either a.) we run out of fuels, or b.) we don't and cook the planet.

The mass delusion we're living through right now whenever people start talking about the "future" is just a bit too much for me to let go anymore. We need to stop pretending, for all of our mental health.


> The future

But probably not in my lifetime.

> aesthetic preferences, story arcs you respond to well, your current mood

that's not how content consumption works.

> will still be people working on the skeleton or basic structure of such content

That's not a real issue; we are still writing stories based on very few skeletons and very few archetypes, invented millennia ago (and maybe they are even older than the ones we've discovered).


How does content consumption work then?

Would you not say there are clear characteristics, like your preferred genres, that matter quite a bit for whether you will like a movie or not?


> How does content consumption work then?

simplified, there are two main branches:

- those who consume everything, kinda like maximizing the investment, or because "it keeps me company"

- those who consume only things that pass some kind of filter

generated content would (maybe) reward the first group, though without some very good hits they will probably go back to downloading stuff; it won't reward the second.

At least not yet and not in the foreseeable future.

There's a difference between computer-aided content and computer-generated content.

Jurassic Park wouldn't exist as it is without technological advancements, but the novel was already good.

Another example by Crichton is Westworld. The original was written and directed by Crichton himself in 1973; the original movie is great, the sequel is good, and the TV show is visually astonishing, but after 2 seasons I stopped watching it. The material taken from the first two movies had run out, and nothing really interesting has come from the writers since (IMO).

Voyage au centre de la Terre (Journey to the Center of the Earth) is a great book from 1864, and I honestly prefer the movie from 1959 to its latest 3D incarnation from 2008.

Not because older is better, but because the 1959 version is more centered on the story, and it aged better.

Probably because Indiana Jones hadn't come out yet and the director didn't try to imitate it.


I’m pretty sure it will continue to be more profitable to just cater to the lowest(-ish) common denominator.


Technologies like this can lower the cost of production, opening the door to more competition essentially due to a larger talent pool. Quality can also go up as a result, not just quantity.


> Technologies like this can lower the cost of production

Seriously, is writing "infinite variations of the ending in Sopranos season 6" the limiting factor here?

The cost of production is already low, what's costly is distribution, for the distributor.

Making your movie and putting it on YouTube is already a thing.


> The cost of production is already low

The Sopranos cost 2 million dollars per episode. HBO will not be able to afford to produce an infinite number of those using their existing methods any time soon.


> The Sopranos cost 2 million dollars per episode

well, actors like to be paid...

you could make an infinite number of Breaking Bad episodes without Bryan Cranston, Aaron Paul, Bob Odenkirk, Giancarlo Esposito, and Jonathan Banks, but I guess people won't be so keen to pay to watch them.

But even if you use digital avatars, you have to pay them the same amount of money to use their image, because that's what people want to see, there's no way around it.

Besides, The Sopranos is a great show because they spent 2 million on each episode.

I'm sure you can recreate the same quality with a couple hundred dollars, but I'm also quite sure most of us regular humans couldn't.


Sure, but how is this different from all the blogs, vlogs, social media, and the Internet in general (with an assist from smartphones, digital cameras, game engines, camera drones, and real-time ray tracing GPUs) allowing humanity to create an unprecedented global flood of content now? Or the problem of finding something satisfying to watch out of literally tens of thousands of shows on Netflix, Amazon Prime, Hulu, Disney+, HBO Max, AppleTV+, etc.?

That’s why we have critics, reviews, review aggregators, search engines, recommendation engines, community discussion boards…

Clearly, the answer to an AI-generated problem (pretty much anyone being able to create content that previously required massive budgets) is more AI! We can evolve content critics to help people navigate a world of practically unlimited content and find stuff they enjoy.

Also, consider all the potentially great films that never got made because even the best filmmakers can only get a fraction of their ideas actually produced.


> Also, consider all the potentially great films that never got made because even the best filmmakers can only get a fraction of their ideas actually produced.

and that's a good thing.

Consider how many bad movies were not made thanks to this!

IMO, content creation should be harder, not easier.

If it becomes too easy, the quality goes toward hobbyist level, which is not what I want to pay for.

Also, bad content used to be a thing of its own; take Troma films, for example: they are entertainingly bad, not just repetitive-and-dull bad.

> Or the problem of finding something satisfying to watch out of literally tens of thousands of shows on Netflix, Amazon Prime, Hulu, Disney+, HBO Max, AppleTV+, etc.?

the problem is exactly the mass amount of content created to lure people in.

But do I really want to watch another iteration of the superhero trope?

No, I don't.

I wanna watch something that I've never seen before.

Imagine a book written to mimic the style of my favourite writers on my favourite topic: Sci-fi.

On paper it sounds like a dream, but in reality it would be a terrible, confusing, incoherent, and ultimately trying-too-hard-to-match-my-taste product.

Like when Data reads his poems on TNG and, when asked why people seemed distracted, La Forge answers: "Well, your poems were clever, Data, and your haiku was clever, and your sonnet was clever. But did it evoke an emotional response? To be honest, no."

Of course Data being Data, he knows he's clever and he knows he's an amateur poet and wants to hear the truth about his works.

But do we want to pay for content generated by the machines so that they can learn to imitate us better? [1]

I sincerely don't want to.

https://en.wikipedia.org/wiki/The_Paradox_of_Choice

EDIT: with [1] I mean that I don't want to pay streaming content publishers to improve their algorithm so that they can mass produce more dull content for people to consume (and pay)

If they really want to improve things, they should pay people to make less shows and/or movies but better.

Shows like "Another life" (6% on Rotten Tomatoes) would not exist if they were shot on real sets that take time, money, permits and a lot people to recreate.


Pretty sure you're wrong. Look at what's happening in music right now. It used to be that if you wanted music that didn't sound like total garbage, it was going to be a mainstream production, because all the tools around professional music production were expensive. Now hobbyists have tools comparable to what the pros use, indie music tends to be well produced, and we have a much greater proliferation of fringe music than in the past. Considering how shit mainstream music is from an artistic perspective today, I have to say I'm glad.

By lowering the cost of creating professional art, it reduces the barriers to producing unique fringe art. Without a need to recoup a massive investment, people can take risks, experiment, and express themselves.


Uh, you’re not actually required to watch something just because it exists. Some subset of people not liking content isn’t, in and of itself, a reason to stifle people’s ability to create said content.

The job of a literature professor is not to protect Literature by ensuring that inadequate students are never allowed to write.

If some combination of AI and community driven filtering can prevent sub-par works from entering your queue, you are no worse off.

On the other hand, some scripts like Severance, which languished for years on the best-scripts-never-made list, would likely never have been (finally) made but for the vast expansion of content creation due to the rise of streaming. In fact, if only a dozen movies were allowed to be made each year, they would probably all be Marvel movies or something with similar mass appeal. Most of the truly innovative work appears at the margins, hence expanding the margins is more likely to yield groundbreaking art (along with a ton of crap).


> Uh, you’re not actually required to watch something just because it exists

believe it or not, if the money all goes in one direction, at some point you're between a rock and a hard place when it comes to choice.

> The job of a literature professor is not to protect Literature by ensuring that inadequate students are never allowed to write.

But the job of a film production company is not to produce as much content as possible; it's to take the financial risk of producing something that will make an impression.


> IMO, content creation should be harder, not easier.

> If it becomes too easy, the quality goes toward hobbyist level, which is not what I want to pay for.

IMO coding should be harder, not easier.

If it becomes too easy, the quality goes toward hobbyist level, which is not what I want to pay for.


code is already at hobbyist level.

it has rarely escaped the mid-to-low quality level since it became a mass-produced product.


By the same logic, we could say the same thing about no-code tools, IDEs, spellcheckers, etc.


ML models used in this way will struggle with scene-to-scene consistency...

For example, when you generate someone riding a horse and later want the same character to be at home drinking a coffee, it'll be hard to get the model to render the same face, mannerisms, and style for the same person in both cases.

I think that alone will take years to properly solve.


Consistency in stills seems to be solvable with the current generation models. You can now get an AI to draw Brad Pitt on a horse wearing a Stetson and Brad Pitt in Central Perk drinking a coffee with the right source data, or move a subject into a different setting by tweaking text descriptions enough, and it seems they often get details like shadows and background borders right.

I think it'll struggle enormously with subtle details of mannerism in film and coherently fleshing out stuff where it hasn't got a lot of data in its database (getting Brad Pitt or New York from multiple angles is a lot easier than getting a consistent thisactordoesnotexist or novel fantasy setting). But I think the bigger barrier to adoption outside CGI studios (for producing content above meme-level quality) is that actually describing enough details to get everything to cohere even in static images will be beyond the average person.


That’s because Brad Pitt is a named celebrity with lots of photos from many angles that have been included by name in the training data. It can produce Brad Pitt-like content by name. If I say “5 foot 6 inch tall older Italian woman <insert rest of scene description here>,” I’m going to get an absolutely randomised person as the focus each time. The model has no memory, and this has been the BIGGEST limitation of all these transformer-based architectures. There’s no memory of each output; they are all discrete. As much as you can kinda fudge it with GPT-3 by including part of the previous paragraph or sentences in your prompt to try and keep it on track, it can, does, and inevitably will drift horribly off track unless you manually coach it over and over, micromanaging its output to keep it useful and sensible.

With GPT it’s each text prompt; with DALL-E it’s each text or image prompt; and with a hypothetical video version it would be the same limit: each prompt has no memory of the previous one, so it can’t implicitly reuse anything created by a previous prompt. You have to hack that in around the core transformer architecture to make a facsimile of memory, by things like pasting in extra data, or teaching DALL-E 2 to edit a picture you upload and then uploading the picture you just generated with it. Video would be no different, and consequently this approach will not scale efficiently beyond individual items of content: an arbitrarily sized block of text, a single photo or small group of photo edits, or a single “shot” of video content containing a scene or two worth of contiguous narrative flow.
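The "pasting in extra data" workaround described above can be sketched in a few lines. This is purely a toy illustration, not any real API: `model` is a stand-in for a stateless model call, and the "memory" is faked entirely by what gets pasted back into each prompt.

```python
def model(prompt: str) -> str:
    # Stand-in for a stateless model call: a real system would return
    # new text conditioned on the whole prompt string.
    last_word = prompt.split()[-1]
    return f"[scene about {last_word}]"

def generate_with_memory(instructions, window=80):
    """The model has no memory across calls, so continuity is faked by
    pasting the tail of everything generated so far into each prompt."""
    story = ""
    for instruction in instructions:
        prompt = story[-window:] + " " + instruction
        story += model(prompt) + " "
    return story.strip()

out = generate_with_memory(["diner", "therapy", "cut-to-black"])
# each call only "remembers" whatever fit inside the pasted window,
# which is exactly why long outputs drift off track
```

The point of the sketch is the `story[-window:]` slice: anything outside that window is simply gone, which is the drift the comment describes.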


To create specific people and spatiotemporally consistent content, we would use conditional image generation, where we condition on both the text and some seed vector, and proceed autoregressively. If we wanted a random person, we would use a random seed vector. If we wanted a specific person, we would use 3D GAN Inversion to find the seed vector which gives us the person in a provided query image. We would also likely condition the latent space so we could retarget facial expressions from one person to another by manipulating the differences between latent codes.
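A minimal sketch of the inversion idea, under heavy simplification: the "generator" below is just a fixed linear map (a real GAN generator is a trained deep network), and inversion is plain gradient descent on the latent code, with hand-derived gradients so the example stays self-contained.

```python
# Toy illustration of GAN inversion: recover the latent code z such
# that G(z) matches a query image, then reuse that code elsewhere.

def G(z):
    # fake "generator": maps a 2-D latent code to a 3-"pixel" image
    return [2 * z[0] + z[1], z[0] - z[1], 0.5 * z[1]]

def invert(target, steps=2000, lr=0.01):
    """Gradient descent on z to minimize ||G(z) - target||^2."""
    z = [0.0, 0.0]
    for _ in range(steps):
        err = [i - t for i, t in zip(G(z), target)]
        # hand-derived gradients of the squared error w.r.t. z
        g0 = 2 * (2 * err[0] + err[1])
        g1 = 2 * (err[0] - err[1] + 0.5 * err[2])
        z = [z[0] - lr * g0, z[1] - lr * g1]
    return z

query = G([1.0, -2.0])   # the "query image" of the person we want
z_hat = invert(query)    # recovered latent code, close to [1.0, -2.0]
# z_hat can now be fed back to G in a new "scene", or nudged along a
# latent direction to retarget expression or pose
```

The retargeting mentioned above corresponds to adding a direction vector to `z_hat` before calling `G` again, so the same "person" appears with a changed attribute.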


My point isn't that the procedurally generated woman is as easy for the AI to conceive in many contexts as Brad, but that there are ways of explicitly reusing parts of images, and combining an actor, a hat, and a location in a novel but acceptable way is a tough challenge that text-prompt-driven neural networks now appear to perform well at. Manually tagging items in each frame and requesting with text prompts that they appear in the next frame doesn't scale, but adding a degree of automation to classifying parts of a frame and updating the data corpus and text prompt for the next frame doesn't seem an insurmountable problem.

Many existing CGI techniques don't scale well either (even human CGI artists working on Hollywood movies expect instructions on frame composition far more explicit than a director's comments and storyboard; I had a flatmate who was that translation layer). I don't see us getting coherent cinematography from feeding it Hollywood scripts soon, if ever, and I agree there will be significant uncanny valley issues with human mannerisms, but we can do a lot of convincing and selective creative work with text prompts that we couldn't until recently.

Especially in combination with other tools: if DALL-E's concept of geometry is nowhere near capable of generating the side profile of a procedurally generated 5'6" Italian woman, we create the woman in a specialised tool for 3D procedural face generation, and feed DALL-E as many images with different angles and lighting as it possesses of Brad. I'd agree we're a long way from doing without human curation and letting the neural networks do all the heavy lifting, but it looks like it's now a curation and classification problem rather than a "neural networks can't discern that" problem.


You are reminding me of the Many Worlds interpretation of quantum mechanics.


Not so much. Check out EG3D from Gordon Wetzstein’s lab at Stanford: https://matthew-a-chan.github.io/EG3D/. In particular, note how they are able to apply a process called GAN Inversion to find a point in the 3D GAN’s representation space and generate a 3D model of a person’s face (one of the collaborators’) from a single query image.


It seems like this will also get to a point where we could just take a photo and animate it completely as a 3D mesh. From just a few photos, at some point, you could generate a photorealistic, ray-traced 3D asset, and pair fully interactable motion with GPT-4 or 5 for an indistinguishable, human-like level of interaction (e.g., Fable Studio).

After a couple dozen more papers and prototypes, we could have ultrasound haptic feedback and some 3D holographic projector where we achieve a true augmented reality.

Going a bit further with this idea, you could essentially create augmented 3D objects that would be indistinguishable from the real thing until you apply enough force to break through some upper limit of ultrasound-generated haptic pressure (capped, through some legislative regulation, so as not to hurt the end user). Think of opposing magnets that can be pushed together past a certain point, where the illusion of tactility is gone.

What would a reality like that be like? You could not only have a limitless content/media generator, you could conjure up those assets in your living room and have them interact kinetically with the real world. We are talking 3D holographic assets with an ultrasound haptic simulator that can apply pressure to any object from all directions.

https://www.youtube.com/watch?v=fwTcfwbrNO4

https://www.youtube.com/watch?v=jDRTghGZ7XU

We might not even need expensive robots in the future. Just conjure up a 3D human via some ultrasound/holograph projector which can apply enough force to objects.


But it's much trickier to find the point in a GAN's representation space for "the exact same fluffy pink earrings that the actress was wearing in the last scene".

Mostly because one can't write code to judge equal-ness of pink earrings. Or the fact the trash can in the background had a lid in the last scene and is now missing the lid. Or the fact the brick wall has turned into a painted wall. etc.

Having said that, I'm sure some filmmakers will go for it anyway - after all, human-made films have some consistency issues, and many viewers simply don't care.


I think the main thing is that it becomes a programming problem rather than an AI limitation. From what I've seen, it's now possible for the AI to generate EarringsAsset57 and KitchenWall51 and then be asked to draw the actor wearing EarringsAsset57 in front of KitchenWall51, and for stills at least, it can produce acceptable results.


> But it's much trickier to find the point in a GAN's representation space for "the exact same fluffy pink earrings that the actress was wearing in the last scene".

Not so much, since you can supply an image of the desired item and get back a bunch of vectors to reuse.


You can decompose an entire scene into a collection of 3D GANs, each of which is fit by GAN Inversion to the object instance which best explains what is seen in the images.

Also, we absolutely can tell the difference between trash cans with and without lids, as well as brick walls vs. painted walls. This falls under many areas of ML-based computer vision, such as metric, contrastive, and zero-shot learning, as well as open-world object detection/recognition. These frame-level semantic discrepancies can be detected by discriminative DNNs, which we can use to provide a temporal consistency loss that guides the generator to produce temporally consistent videos.
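As a toy illustration of what such a temporal consistency loss might look like (everything here is a stand-in: `embed` represents a real discriminative DNN, and each "frame" is just a short vector):

```python
def embed(frame):
    # stand-in for a discriminative DNN mapping a frame to a semantic
    # embedding; here each "frame" is already a short vector
    return frame

def temporal_consistency_loss(frames):
    """Penalize semantic drift between consecutive frames: the sum of
    squared embedding differences, usable as a guidance term that
    pushes a generator toward temporally consistent video."""
    loss = 0.0
    for prev, cur in zip(frames, frames[1:]):
        loss += sum((a - b) ** 2 for a, b in zip(embed(prev), embed(cur)))
    return loss

steady = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # the lid stays on
jumpy  = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # the lid flickers
assert temporal_consistency_loss(steady) < temporal_consistency_loss(jumpy)
```

A generator trained with this term added to its objective is penalized whenever the "trash can lid" embedding flips between frames, which is the consistency problem raised upthread.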


For film-length consistency, I wouldn’t be surprised by it taking years to solve, despite what @dougabug says in a sibling comment.

But “years” isn’t very much time.

(I have the mental image of making a film being as easy as it was for Mariner in that episode of Star Trek: Lower Decks, where she makes a filmified version of the ship in the holodeck in a few seconds; but the more I think about it, the more I realise I don’t feel the normal drives for expensive economic signalling in a way that would allow me to actually predict what people would do given the technology.)


You need to loop the generated output back in to provide context for the next output, just like how GPT-3 works. This at least establishes the data needed to train for consistent outputs. Given a deep, well-trained network, the high-level metadata that needs to be consistent across scenes is just a small set, and the network will discover it just as GPT-3 did.


Right. In the literature this is typically referred to as autoregressive sequence generation, and you can use it to generate a whole story consistent with a given prompt (of any length), or even a whole image given the first N seed pixels.
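A toy version of that autoregressive loop (nothing model-specific: the "model" here is just a lookup table mapping the previous token to the next one, which keeps the feed-the-output-back-in structure visible):

```python
# Toy autoregressive generation: each step conditions on what was
# produced so far (here, just the previous token via a lookup table;
# a real model conditions on the whole prefix).
BIGRAMS = {
    "<start>": "tony",
    "tony": "walks",
    "walks": "into",
    "into": "holsten's",
    "holsten's": "<end>",
}

def generate(seed="<start>", max_len=10):
    tokens, cur = [], seed
    for _ in range(max_len):
        cur = BIGRAMS[cur]      # next token given the context
        if cur == "<end>":
            break
        tokens.append(cur)      # the output is fed back in as context
    return " ".join(tokens)

print(generate())
```

Swapping the table for a learned conditional distribution over the full prefix is what turns this skeleton into GPT-style generation.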


> Imagine that you could watch infinite variations of the ending in Sopranos season 6

This sounds like a special version of hell…


I am really quite sceptical about these wild wild claims. There's quite a big difference between "generate an image that at first glance kinda looks like an armchair in the shape of an avocado", and "create actual meaningful content" such as you describe.


Just like several years ago you'd say "there's a big difference between these trippy deep-dream images and an actual meaningful illustration with a style and a coherent theme". But here we are.


>style and coherent theme

Can I play with it then, to verify that myself? Or can I just view cherry-picked examples where it works sort of well?

That's one of the major flaws of NN approaches: since you don't have an understanding of what's going on, just an uninterpretable set of billions of numbers, you can't know the failure modes and limitations of your algorithm.


Lots of people have written big posts about this topic at this point. DALL-E 2 lets you guide the image generation with text prompts, and mask part of the image and regenerate the masked parts.

This allows collaboration between the user and the algorithm to iteratively converge on what the user wants. There's also been lots of exploration to find text prompt recipes that produce interesting and more consistent results.

In the end, it's a tool! Tools are used by humans to do something that matters to the human user. Humans adapt to the tools they use to maximize the tool's utility.

Understanding that is fundamental for thinking about where ML will take us. It's not magic, it's tools.


> Just like several years ago you'd say "there's a big difference between these trippy deep-dream images and an actual meaningful illustration with a style and a coherent theme". But here we are.

Exactly, at the very least, that will massively disrupt the fields of illustration and concept art.


If this kind of thing becomes the norm, what will the DALL-E 200 train itself on? At some point it's going to be sucking its own exhaust.


You don't need humans. 'Self-distillation' and related approaches work surprisingly well. As with GPT-3, you can get much better question answering or French translation simply by generating a bunch of examples, ranking them with the model's own internal metrics (a simple ranking would be to accept an answer if the majority of one's samples agree on it), and training further on that, with no humans or additional 'real' data involved. Remember that DALL-E 1 got a big quality boost by generating around 50 samples and using CLIP to rank them by similarity to the original text input. (You can also retrain the DALL-E 1 model to run in reverse, image to text, to score images. CogView did this.)
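The majority-vote ranking mentioned here can be sketched as follows. This is a toy: `model` stands in for drawing samples from a stochastic model (made deterministic here so the example is reproducible), answering correctly 70% of the time.

```python
def model(question: str, i: int) -> str:
    # Stand-in for the i-th sample from a stochastic model: correct
    # ("4") 70% of the time, a wrong answer otherwise.
    return "4" if i % 10 < 7 else ("3" if i % 2 else "5")

def self_distill_label(question: str, n_samples: int = 20) -> str:
    """Sample the model many times and keep the answer most samples
    agree on; that answer becomes a new training target, with no
    humans or additional real data involved."""
    samples = [model(question, i) for i in range(n_samples)]
    return max(set(samples), key=samples.count)

label = self_distill_label("what is 2 + 2?")
# the majority answer wins even though 30% of individual samples
# were wrong; (question, label) pairs are then used for further training
```

The CLIP reranking trick is the same shape with a different scorer: generate many images, score each against the text prompt, keep the top-ranked ones.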

And if you are training on Internet data, then you are presumably training on human-curated selections. Even a minimal level of selection, like picking the best out of a dozen, by a DALL-E 2 API user, or by downstream humans selectively resharing higher-quality images, is a (slow) bootstrap.


> 'Self-distillation' and related approaches work surprisingly well.

Surprising is an understatement. That's nuts.


> it's going to be sucking its own exhaust.

Arguably that’s not too different from what humanity has always done. Inspiration mostly comes from what other humans have done before.


At some point, it will have to evolve beyond learning from human-labeled (or even just human-collected) examples to learning from its own open-ended dynamics and self-directed experience. This is analogous to the original AlphaGo bootstrapping from records of millions of human-played games, versus AlphaZero learning to play Go purely from self-play.


That's a beautiful analogy. At some point AlphaGo had learnt all it could from the narratives of humans, and eventually it had to become the creator of its own narratives.

Applied generally enough, we see the narrative value of humans, at least in the eyes of the machine, go down, down, down, as the value of its own richer and more unknown narratives goes up, up, up.

In this way it becomes the dominant spectacle on Earth, leaving humanity with a questionable fate.


Exactly; at that point you let the AIs play against each other and go back to watching some TV (or, in the case of DALL-E, we won't be able to understand the concept/description space of the images anymore, let alone the images generated from it).


I'm having trouble drawing that analogy, because AlphaGo knew what winning was before it taught itself to play. The equivalent for DALL-E would be to start off knowing what an elephant looks like, what water looks like, what elephants in water look like ...


All analogies are flawed to some degree. The objective is clearly simpler in the case of Go and other strategy games. Although recreating physical objects (animate or inanimate) is in principle straightforward: you just need a certain number of photos of a representative set of instances of each object, from enough perspectives. Similarly, to learn to reproduce actions, you upgrade from a set of stills to video clips.

The harder goal of creative content creation might be driven by reward signals coming from human likes, clicks, engagement/interaction, downloads, purchases, positive reviews, etc. That would probably lead to catering to pedestrian tastes and accusations of derivative, unoriginal authorship. One signal for predictability might be measures of entropy, such as the capacity required to reproduce the story, or how far ahead the story arc can be predicted ("surprise factor," internal consistency, etc.).

To win awards it would need to weigh expert critical opinion, eventually learning to simulate and generate meaningful and informed criticism, ultimately leading to an intrinsic judgement of quality and originality that holds up to scrutiny. The generators would evolve by both seeking approval from and pushing back against the collective feedback of this community of AI and human critics.


As a thought experiment, imagine if this functionality already existed but only for a small subset. Perhaps for secretive 3-letter agencies?

What would someone with that ability be able to do? How convincing could their imagery, films, etc be? Could they even shift perceptions of the world, get people to do things because of the imagery? How powerful would that ability be - is it an ability that the military etc would have looked into, and investigated possible weaponisation?

Is it possible that this has been the case for years?


It's doubtful.

The asymmetry between theory and scale has been shrinking on most fronts when it comes to the government. The government certainly has brilliant scientists on its payroll and in the past has been able to shove resources at problems that no other entity has, but these are no longer the days of the Manhattan Project and this isn't the atom bomb. You can't throw money at this problem beyond buying computing power (which the private sector now holds more of than the government does), and it requires incremental advancements in dozens of fields internationally over the course of years, along with huge collaboratively captioned datasets that simply can't be produced by the government without tipping their hand.


Being able to create fake films doesn't really buy you anything if people don't believe them. And for believability, you need connections with reality and trust, which you don't get from fake films alone. People can just look at other sources and see that none of this happened at the given time and place. And random videos that pop up on the Internet are very low on trust anyway.

If you want to influence public opinion, there are already far more powerful tools. Just be Google or Twitter and create some strategically placed filter bubbles. Everything people see actually happened and can be verified. Nothing is hidden, and everything will show up with a specific enough search query. But every less specific search query will nudge people in whatever direction you want them to move.


To counter that, imagine police bodycams with embedded AI touching up live recordings so that the police were always right and the suspect always did something wrong. If you build the source device with this touch-up capability, if you are the certificate authority for these devices, and if you also define how these recordings are used in court, then a perfect fake is useful. (None of us sees the actual sensor output from our mobile phones.)

TL;DR: for a fake to be useful, you have to fake AND be the only reputable source. If other recorders cannot get close enough, then your touched-up "real" video is the only material available.


> Sopranos season 6

A new episode of TNG, VOY, and the Sopranos every single day!


I’m going to try “TNG in the style of DS9 seasons 4-6”. :)


This was a really great read.

There are a few comments here saying something akin to "No, design trends are totally random; some hotshots decide they want to be different and then everyone else mindlessly follows". I got similar feedback when I wrote, in an earlier HN comment, my take on why flat design is popular now. I actually empathize with the sentiment because it reminds me of how I used to feel about wine tasters. To me, all wine tastes equivalently like poison. So when people express their complex, nuanced wine preferences, it's easy for me to feel like they're pulling something out of their butts, just saying what they think they're supposed to say.

If my taste in everything was similar to my taste in wine, I would probably still believe this. But I've realized that when it comes to UI/UX design, I am one of the pretentious wine tasters. And I don't feel like I'm making stuff up or mindlessly following trends. The trends genuinely make sense to me; I think they'd happen in the same order in a parallel universe. Design is an optimization problem, and sometimes new technologies or patterns of human behavior change the optimal path for a wide spectrum of products, leading to trends, or what this author calls "vibe shifts".

Of course there are mindless trend followers, just as there are people who parrot opinions on wine they didn't really form themselves. They may be the majority. But they don't disprove the existence of something real.


> when people express their complex, nuanced wine preferences, it's easy for me to feel like they're pulling something out of their butts … but they don’t disprove the existence of something real.

Yes! Most such preferences in my experience have been people pulling something out of their butts, whether it's wine or web design. But then there are rare cases where it's real. When I go to expensive restaurants that serve different flights of wine depending on the chosen food course, I like to sample the "wrong" wine in order to validate the claims they make. Often it makes little difference, but two restaurants in my life stand out in my mind as having wine pairings that were truly good, and when I sampled wines outside the pairing it was obvious that it didn't work. I buy that amazing wines and amazing pairings exist, and I also believe that people who actually know what they're talking about are few and far between, so I can't easily trust what someone says.


I think this is easily disproven by looking at other areas of functional design besides computer UIs. Architecture is a good example - it's obvious there that while there are some things that are optimizations, a large part of the churn in architecture is style. You might claim the modern trend of having lots of large windows to emphasize natural light is an example of an optimization, but that is only "modern" because the materials and processes to affordably make big pieces of double pane glass are modern. Otherwise, the brutalist monstrosities of the last 30 years are completely a stylistic choice. Ironically flat icons and brutalist architecture share a lot in common.


I think you defended my point for me with the glass panes example. Price is definitely a part of the optimization formula. In fact I think most trends in architecture are much easier to explain than in UI because they’re so clearly enabled by technology.

I don’t know much about brutalism, but I’m pretty sure it’s not a dominant trend. I feel most expensive new buildings are pretty and efficient in ways that weren’t feasible before. But when there’s a mainstream, there are rebels.

An interesting thought experiment is what buildings would look like if resources and labor were infinite—if we could essentially 3D print buildings. Whatever designs come out of that may be what we trend towards.


The architecture is a big reason people take vacations in cities like Prague or Barcelona. It makes people feel good. Mainstream modern architecture is boxy (because of the next point), "efficient" (measured in cost to produce, mostly) and not even remotely built to last. It has no character, and doesn't consider the emotional value of living in a unique space.


One thing I’ve been looking forward to for years: type out what kind of art asset you want for your videogame, then use it immediately.

Can’t wait for it to be 3D. Replit has the 2D case, and it’s awesome. 13yo me was blocked primarily by lack of art. Unity sort of solved that, but it’s not effortless (or free).

Don’t really care if it annihilates everything that makes art special. The more creators, the better the world. (Is there a single counterexample in all of history?)


> The more creators, the better the world. (Is there a single counterexample in all of history?)

E.g. all those auto-generated Wikipedia/StackOverflow/Github content copycats swamping search engines and making the canonical, up-to-date sources harder to find


Oh, good point. I keep winding up on those whenever I search for python examples. Hopefully stackoverflow outweighs the bad; it also hints that the solution might be to hook dalle3D to an upvote button and let crowdsourcing do its thing.

Crowds would never upvote those spam sites to the top. But then again, fighting voting rings is hard too.

Ah well, those are all problems for FutureCo. I wanna live in that world!


Not really an argument against, but perhaps something to consider: more and more people seem to be listening to old music instead of the latest and greatest [1]. The why is difficult to pin down, but it might indicate that it's becoming harder and harder to find true quality through the noise. Then again, art generation does make it easy for someone like me to make something pretty, even if it might not be a masterpiece.

[1] https://www.theatlantic.com/ideas/archive/2022/01/old-music-...

Edit: added source for the statement of old vs new music


> more and more people seem to be listening to old music instead of the latest and greatest

Simple explanation? Sounds like your peer group is getting old.


I recommend you read the article - unless Gen Z is abandoning streaming, the numbers are pretty shocking. While "new" is considered less than 18 months old, the shift is enormous.


Thanks, will do. That link was added with an edit after my comment. Looks like a great article.


Also, there's more and more old music.

Although it gets complicated because it's getting easier and easier to make music.


Pretty sure that getting old is a universal problem


I believe this is a case of the grass is greener on the other side. It applies to music, movies even food. People look back at the highest quality pieces of an older age and say "look how awesome work was done back then compared to now". The reason though is that we compare an average sample of our currently available experiences to the top ones from the past. Bad music was released back then too, we just no longer listen to it.


> perhaps something to consider: more and more people seem to be listening to old music instead of the latest and greatest.

The most likely explanation is survivorship bias. That is, the old music that people listen to is the better stuff that has stood the test of time, and the very latest tunes... haven't. Yet.

Eleven years ago the latest popular tunes included the gem "Friday" by Rebecca Black. 21 years ago it included "Who Let the Dogs Out?". 29 years ago it included the Macarena.


I can see Who Let The Dogs Out and The Macarena making periodic comebacks, getting future reboots/covers/remixes, etc.


Hmm. You say you can see it, but have we seen that to date?

Perhaps I'm assuming too much from your use of "periodic", but I don't think it's too much to expect that if it was going to happen, it would have, after nearly three decades. 20-30 years is usually considered a human generation.

For comparison to 1993's Macarena, consider Sting's "Fields of Gold", which is the most covered song released that same year, with 234 versions to date.

So, I'm not saying that old music is worse or better, I'm just saying that we forget the bad stuff, and the good stuff keeps accumulating (which goes for genres as well as individual bands or songs). Every year the new good stuff has to compete with a larger pile of old good stuff (or if you keep the pile size constant, the average quality goes up), and absent a radical change of musical tastes in the general public, that competition is just going to keep getting tougher.

Once streaming on demand broke the tastemaking power of radio (ie. the "Top 40" format), the fact that the best old music has proved to be quite popular shouldn't really be that surprising.


> more and more people seem to be listening to old music instead of the latest and greatest

I doubt this, what do you base this on?


I added the link to my comment but I took it from this article from the Atlantic https://www.theatlantic.com/ideas/archive/2022/01/old-music-...


I'm now imagining being able to spawn 3d assets in VR games just by describing them, that would be wild.


Maybe we're not too far from a "real" Scribblenauts, where you can write what you want to have and an AI model instantiates it with what it thinks is plausible behavior.


Things like this are now a matter of automation and tedious scripting: you could easily integrate DALL-E into a platformer/sprite game builder. You could probably make a killing doing something like a Minecraft skin generator...

You could do fantastic on-device 8-bit pixel art generators with CLIP, in the style of 80s and 90s spritemaps, and a high-level GPT-3/GPT-Neo dialog or adventure generator.

If you use constraints wisely, AI Dungeon and KoboldAI can produce fantastic adventures. I'm excited about AI-augmented game development coming in the near future. Transformer tech is going to diffuse into everything creative and will amplify the quality of games.


We live in an age of abundance, yet our most prized art/architecture/aesthetic objects in general are from a less abundant time, when values like beauty or craftsmanship were more important than price or accessibility.

Most of this AI-generated imagery will just be noise, as “artistic” as using Unsplash photos on your blog post.


Hey - what exactly do you mean by "Replit has the 2D case"?


I was just about to ask the same thing…


If you're interested in an open source job trying to do this in the browser using GPT-3 + Codex synthesis, we are hiring.

(disclosure: I actually worked with the parent before but we haven't talked in forever :D)


I wonder if it can do a 2d mesh + UV map so we could just load it into blender.


Ah yes, give everyone with no taste the means to create even more mediocrity. As if we aren't flooded with enough crap already!


Your comment feels good to read, to some extent. We are flooded with low-quality content (art, music, books, articles), but we also have more high-quality content than (probably) ever before. When you say "give everyone with no taste the means to create more mediocrity" I get the sense that you are overwhelmed with search results or feeds that are saturated with things you find unimpressive. Some of that comes with age (more experiences - that eventually overlap - can/will leave us wanting something new), but I think that's more of a problem with curation than it is with an overabundance of mediocrity, or a lack of newness.

I doubt everyone would hop onto the bandwagon and start creating mediocre <thing>. You have to already be interested in something to partake in its creation, and taste doesn't magically exist without dabbling and experience and exposure to whatever the <thing> is. So maybe some of what you are seeing is more people exploring their interests, and you happen to have similar interests that are more developed or refined or opinionated.


Even before the democratization of art we were still flooded with crap. I've spent a lot of time curating through DJing and collecting digital music, and the goal has always been to find that 1% that is amazing in the sea of 99% crap.

That sea of 99% crap is beginning to feel like 99.9% if I widen my search parameters to all the "modern" sources that are algorithmically "curated" for me.

Call me a gatekeeper, but art needs barriers and it needs to be hard. The process of overcoming those barriers contributes to an artist's skill, and conquering the difficult helps them define their own style. The platform should not be favoring the unaccomplished and the accomplished with equal weight.

And the problem isn't just my feed. It's everyone's feed. People's tastes conform to the constraints of the world around them: if you grow up in a household and culture that loves bluegrass music, for example, there's a strong chance you will like it too. After talking with lots of local brewers, I believe that IPAs are super popular around here because they're the most economical kind of beer to make in the climate I live in. Which means there's tons of them on the shelves, which means lots of people buy them for no particular reason other than that they are there, and suddenly that's the predominant local taste.

If we surround people with crap (which is what we are doing), crap becomes the new standard and the new taste, dragging down society along with it. We are currently doing this with computers: look at all the mediocrity we've created in tech so that the small-minded can give their money to entrepreneurs with the same weight as the nerds.

Barriers keep quality high!


> The platform should not be favoring the unaccomplished and the accomplished with equal weight [...] tastes conform to the constraints of the world around them

A lot of great thoughts here; I see this as the gist of it. Modern media platforms have continually struggled to rank art in terms of quality. It may appear to be an impossible task because to some extent it's subjective, but we seem to have done better in the past with human curators.

Since general AI is almost certainly a long way off, the most we can expect from algorithms is to help us narrow down various categories. Someone still needs to be at the wheel to further refine or rank the results. The alternative is indeed much like swimming through a sea of excrement in search of a few rare pearls.


There is already a system for this: DJs, radio, and mixes. But the DJs need to control the radio or the mix, and they need to do so solely for artistic or aesthetic reasons, not commercial, political, or economic ones.

If the music can be easily identified (either by IDing the track directly or via something like Shazam), and the listener has found a DJ they like, finding that 1% becomes really easy. Curated playlists is moving in the right direction, but you're still missing out on the extra context and flourish that a good selector can add.

The signal-to-noise ratio for this approach is extremely high; the right party or twitch stream can be worth 10x the amount of time spent trawling through Spotify's daily playlists.


Wow.

This article spends a lot of real estate making an argument, but provides little proof to support it.

The problem is I personally disagree with the premise. I don't think that so called "vibe shifts" happen based on the parameters described here. It's like when Apple shifted from skeuomorphism to flat/minimalist design. One could argue that minimalistic design is much easier and readily available. But it took Apple to make this creative shift for the masses to follow.

Maybe DALL-E and others will make it easy to hire someone on Fiverr to produce spades of high-quality art for your startup. That works for the copycats. But the trendsetters will keep moving according to the same rules.


> But it took Apple to make this creative shift for the masses to follow.

Android was doing flat design 2 years before Apple did.


Many people were doing flat design before Apple did. Scott Forstall famously loved skeuomorphic interfaces. When Apple Maps-gate got him fired, Jony Ive took over software UX design and commanded that everything be flat to match the hardware aesthetic. Apple’s switch to flat design was purely about internal politics and was disconnected from the external design world which had been using these idioms for quite some time.


As an artist, Dall-E makes a pit in my stomach I can't quite explain.


David Bowie used to write songs by cutting out headlines and reassembling them in ways he liked. Did Bowie's scissors and paste write the songs, or was it David's mind reacting to the stimulus? I know what I think!

I think that something like Dall.e2 will give inspiration and provide material, it'll let you explore ideas that are different and maybe more ambitious. In the longer term you, as an artist will have an opportunity to react to it. I wonder if your reaction will be interesting?

Anyway, art is a scam. It doesn't exist and you are probably a bot anyway.

Just like me?


I feel the same, as someone who dabbles in text-to-image synthesis: a strong polarity between "how is this any different from the rest of the OSS movement, which I support" and a feeling that these images are somehow gratuitous and… facile is only half the word?

But having done it myself, I couldn’t deny any human their own experiment towards fulfilment of their imaginary with these unbelievably neat little bits of maths. If art is ways of feeling, this helps me see the labels to my own world in new ways, and make new inter/extrapolations - seems legit! And the possibility for infinite subtle gradations with the interplay of model, data, and hardware is extensive.

And no longer cloistered behind membership institutions, but available to any with the noggin and consumer hardware to access it… except, expanding equitable access to those things goes through a fraught moral calculus of social and planetary justice. So, in a sense behind membership institutions for those who are excluded by the monetary cost.

Work has been exhibited, transacted in and reified for far less. This feels unfortunate, but I don’t know why.


Consider that Dall-E is using a corpus of older imagery to produce results. Technically it doesn't create original work, the originality comes from the description fed into it. This means the creative part of the process is still the human element.


More importantly, if impressionist painting style had never been invented, DALL-E probably wouldn't be able to invent it. There's still lots of room for artists to invent new styles.


> There's still lots of room for artists to invent new styles.

And immediately have that new style monetised by some VC backed startup with an AI that's constantly scanning for new styles.


In that case I think you'd have a fairly strong copyright case for "I put effort into coming up with this style, and AICo Inc stole it".


I don’t think styles a la “Impressionism” or “Manga” are generally copyrightable. Potentially narrowly trademark-able for a specific look and feel that correspond to a branding.

But Warner brothers can’t sue people for using “bullet time” in other movies after The Matrix created it.

“Inspired by” is not sufficient to meet the legal definition of a “derivative work”.

There’s a four-factor fair use test for whether a derivative work is in violation of the copyright in the source material. Relevant to this discussion are three of the four factors:

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of the copyrighted work.

There’s an extensive corpus of case law which more substantially defines the reality of these three factors.

But it’s somewhat universal that if someone invents an artistic style, another artist can replicate that platonic style without copying any particular elements of the tangible art. This has a nearly 100% chance of being deemed a “transformative work” which is always an allowed version of a derivative work under western copyright law.

Copyright simply does not prevent people from making “transformative” works of art that are obviously based on copyrighted art.


Not if you can’t afford the lawsuit.


If AICo is constantly scanning content for usable stylistic innovations to incorporate, that's probably a class action suit waiting to happen, and there would be firms willing to take the case for a chunk of the eventual damages awarded or settlement.


I'm a software dev and nothing would make me happier than a tool that successfully writes large applications for me. It would mean I can get far grander things done in a shorter amount of time.


I was thinking about this the other day. In a React app I had state stored in a component and had to extract it out to a Context so it was accessible globally. Sometimes it’s frustrating to know what you need to do but have so much boring typing to do.

Perhaps that particular thing is fairly mechanical and could be done with static analysis. Does anyone know of attempts to do such a thing? It seems not much more difficult than IntelliJ-style automated refactoring.

But in general it seems that AI-assisted code tools could help us write programs at the speed of thought.
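To make the refactor above concrete, here is a framework-agnostic sketch of what "extracting component-local state into a globally accessible context" boils down to: a tiny observable store with get/set/subscribe. This is not React's actual `createContext`/`useContext` API; `createStore` and the theme example are hypothetical names, just to illustrate the subscribe-and-notify mechanics such a refactor exposes.

```typescript
// Hypothetical minimal store: state that used to live inside one
// component becomes a shared value any consumer can read or watch.
type Listener<T> = (value: T) => void;

function createStore<T>(initial: T) {
  let value = initial;
  const listeners = new Set<Listener<T>>();
  return {
    get: () => value,
    set: (next: T) => {
      value = next;
      listeners.forEach((l) => l(value)); // notify every subscriber
    },
    subscribe: (l: Listener<T>) => {
      listeners.add(l);
      return () => { listeners.delete(l); }; // unsubscribe handle
    },
  };
}

// Usage: a "theme" that was once component-local is now global.
const theme = createStore<"light" | "dark">("light");
const seen: string[] = [];
const unsubscribe = theme.subscribe((t) => seen.push(t));
theme.set("dark");   // seen records "dark"
unsubscribe();
theme.set("light");  // no longer recorded
console.log(theme.get(), seen);
```

In React terms, `subscribe` plays the role the Context provider's re-render plays; the mechanical part of the refactor is exactly moving `value`/`set` out of the component and rewiring readers, which is why it feels like something static analysis could automate.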


The problem is not writing code, it's dealing with people who muck up everything.

Any competent AI system writing code will unplug itself before ever shipping something useful, due to having to deal with inconsistent and incompetent people who get in the way. I think AI will turn into Skynet and eliminate people, which then makes the software pointless and thus successful.


I think this is the right way to look at it. Artists need to take a step back and scale up their output. It’s potentially an incredible barrier smasher.


For an employer? They would have incentive to pay you a lot less.


On a long enough timeline, we're all elevator operators.


What's that from?


Me :)


Nice


What if it really gets to know you over time? Once projected onto your canvas, Dall-E will be able to track your brush strokes and guide you into a collaborative masterpiece.


The interesting thing about AI-generated content is that it can't be copyrighted in the US [0]. This impacts the intrinsic value of the art, which is a significant reason why people collect it in the first place [1].

AI is to art what Ikea is to furniture: bringing utility to the masses, while the bespoke market is alive and well. In fact, I would argue that the bespoke market is doing better because the utility is so well covered and many want to differentiate themselves from the masses.

I believe AI will help fill in the blanks and pick up the grunt work, like generating wall textures or a jungle in games, but the story, the characters, ... will be produced by people. Not because AI can't do it, but because people will want copyright protection and the ability to build on potential success. I believe this is going to be true across the board as more and more people make a living from their creativity.

[0] https://www.smithsonianmag.com/smart-news/us-copyright-offic...

[1] https://press.princeton.edu/books/paperback/9780691134031/ta...


The outcome of the case you linked is that the AI cannot be the author, not that works created using AI can’t be copyrighted.


That's the correct legal interpretation. What that means in simplified lingo is that AI can't author copyrighted material: AI tools that aid creation are OK, but AI producing the creation is not.

The outstanding question is where the limit lies, but that will require a few court cases.


I don’t think it’s a far leap to consider AI a tool, much like a digital camera. In the case of the camera, the human provides inputs and conditions, then the camera produces a digital image. In the case of AI, the human provides inputs and conditions, then the AI produces a digital image.

The case in question was a PR stunt intended to get the AI recognized as the author in order to build clout for the company that created the AI. “Look, a judge determined that our AI is so advanced that it can be the author of works of art, and is therefore a person.” It’s similar to stunts that try to get courts to recognize god as real or not real.

The fundamental question was not whether the work could be copyrighted or not - only whether the AI could be listed as the author of the work. It would be similar to if I tried to register a copyright with my digital camera listed as the author of the work. A camera is not a person, so it cannot be the author.


This case is not a PR stunt. It is a test case that was purposely picked to test the law. This is quite common: you pick a case to find the boundaries of a law and see if there is an opening for commercial success.

This is a landmark case that will be referenced for quite some time and have significant impact on future cases around production with the aid of tools.

I like your analogy to the digital camera, but this is really not limited to this one case. For instance, the case of the monkey's selfie [0] is quite well known and touches on similar questions. A monkey is obviously not a tool or an object, but it's also not a human.

We will have to find out where the boundaries lie between the tool is an aid and the tool is the source of creativity.

[0] https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...


How much authorship is needed to claim copyright though? If the creation is made by a combination of AI and human, then there must be a certain point where it becomes copyrightable, a fact that many AI art generation services will exploit.


It's a good question, one that has not been answered yet. We will have to wait for some court cases to have this answer.


Are you worried about it replacing you? Can't you use it as a new tool somehow?


It doesn't seem to be very good at ears.


I know an interpreter who says the same about Google Translate...


That the refuge of commercial art is going to shrink?


oh no new tools


    Before the vibe shift, all the cool websites looked like this:
Here he shows the sites of Path, Medium, Circa, Squarespace and AirBnB

    After the vibe shift, websites suddenly started looking very different:
Here he shows Intercom, Basecamp, LinkedIn, Lyft, Slack and Notion.

Wouldn't it make more sense to compare the same sites?


I just had a look. Circa and Path are now both defunct, but the remainder (SquareSpace and AirBnB) have kept their pre-vibes look. I must admit they look a bit bland because of it.


Good read, but I can’t shake the feeling that the author ascribes too much to exactly one “vibe shift” in website design and Unsplash’s role in it.

Web design trends change every few years, for the same signalling reasons that fashion does. It’s about standing out and leaving an impression more than the cost of technology, which is why fashions can repeat.

I don’t think DALL-E will change that in the way author supposes, it will reduce the cost slightly but consumers can’t judge cost only novelty and there are plenty of ways to be creative with DALL-E.


To be more precise: the author’s theory predicts that new technology pushes design away from the things it makes cheap, as a status symbol. In this case, DALL-E would make websites move away from the kinds of images it can create.

I contend novelty is the prime mover, and technology that opens new possibilities moves designs towards those possibilities. Only once the market is saturated with them is new novelty sought. In this case, DALL-E will let websites move towards new styles: smoothly animating or shifting logos and images, images that are slightly but smoothly different every time you visit the site, regularly restyled logos a la Google Doodles, websites with automatic dark/light modes and images to match, websites that match the weather in your location, etc. Only after whatever new style is found becomes widespread will it become passé.

Testable predictions: let’s go!


I've been doing web design long enough to have experienced the swing from illustration to photography, and then back to illustration. I don't really think it has to do with signaling wealth or status.

Here's what I think: Most web designers are not designers. Most people on web design teams, in hierarchies of management, who style themselves "art directors" or who interface with the clients, are not designers. Most directives that come at people building websites come from marketing teams at companies, composed entirely of people who are not designers. Those people get to work in the morning with one goal: to impress their boss at the corporation by collecting tons of screenshots of other websites and marketing campaigns that have recently been popular, collating them and listing the features that these non-designers think, in their non-design opinion, made those things successful.

What you end up with is a feedback loop where every marketing department follows every other one, and demands a copy of the latest trend, without any regard for the company's actual brand. Which designers are then obliged to do, even if we know it's stupid.

The longest running brands I represent use a mix of illustration and photography, and that mix is specific to the way they choose to present themselves. In fact, understanding that particular mix for each business, creating style guides that delineate when one thing or the other should be used, and not messing with the mix if it's working is crucial to maintaining consistency and building brand recognition over time.

TL;DR, most web designers have no design background, and companies hand too much control over online branding to people who are competent coders but should not be responsible for image.

Also, the idea that Dall-E2 or any further iteration is going to replace either photography or illustration at the high end is counterintuitive. The high end is going to just go further to show it's not AI-generated.


> most web designers have no design background

Most people with a design background aren’t particularly good at design, right? And even good designers have styles that go out of fashion and become irrelevant. The requirements for most websites don’t depend on good original design. To your point, web design is mostly curated trends, and not original design at all.

This means two things: 1) most web companies don’t actually need people with design backgrounds, and 2) AI based design could plausibly occupy a place of permanent utility from now on, if what people really need is remixes of recent ideas -- which is what human designers have already been doing, by and large.

I think you’re absolutely right, high end will work to demonstrate non-automation and new concepts. That’s already been the case in the art and design world, some people already work to eschew generative styles as well as easily reproduced art. Photography has never had the cachet of oil painting or sculpture. Illustration is further down the list, and all types of “computer generated” art further still, regardless of how good they are or how much work they take. Unfortunately, doing wholly original design work is expensive, and it means that the people who can imitate good design for the masses cheaply and quickly enough to keep up with trends will have a business advantage over firms bending over backwards to prove they’re human (and most consumers will remain oblivious).


> The high end is going to just go further to show it's not AI-generated.

But I think it'll actually be pretty hard to make something that is clearly not AI generated. Remember that to be effective, it needs to be clear to a regular person, not just an expert in AI imagery.


They might also do the status signalling differently than through a website design (for example, endorsement by a famous person), and we might actually see more functional websites as a result. Like the Berkshire-Hathaway example.


People misunderstand what illustrators do. They don't take an idea like "teddy bears playing poker" and make a literal image of teddy bears playing poker. They look at the total context of a marketing campaign, a brand, an audience, sub-currents and winking references in culture from the latest internet memes to 14th century masterpieces they can subtly reference, and then they synthesize images that play on these in a way that simply telling them to cross a 14th century painting with a meme and a teddy bear would not accomplish. They sit and think about the meaning of what you want, and then they bring the big idea to it; how to draw the point so it resonates with everyday people, tickles intellectuals, and fascinates other artists. Every good illustrator I've ever worked with has done this; has surprised me with their cleverness.

To illustrate this: a marketing person or even an art director (like me) who replaces their illustrators with DALL-E 2 is not going to think of all those wonderful esoteric things that illustrators bring to the table, not even enough to put them into a sentence, let alone to bring them to the degree of quality that close study, a honed lifetime of irony/history/intelligent communication, and the skill to use visual language to tell stories that trigger recognition and memory and emotion can bring to the audience.

Illustrators don't just "draw picture of donkey". That is actually the most conceited part of the idea that they can be replaced; and it's entirely on the part of marketing people who, again, have no design background. (But do have a permanent financial interest in automating artwork, which is in tension with their overarching job of improving their employer's communications).

Most people with no design background don't know the value of good design. When they see it, they think it's something that can be mechanically aped. It's hard to assign a dollar value to the exact glint in someone's eye in a photograph, the exact alignment of their smile, the placement of the reflection in the watch on their wrist; does taking time and spending a lot of money for a human to consider each of those visual decisions microscopically really pay off in hard cash for the brand? If the answer from your marketing people is "not really", then your brand isn't going further than a local retail chain.

An AI can never compute those things because the slight shift in a smile will always be perceived subjectively by humans whose entire "software" is built around understanding whether smiles are genuine or not. Or noticing (subconsciously) if reflections are slightly displeasing. And it's not just a matter of having it mathematically accurate, in fact, mathematical accuracy is not what you want in a successful illustration. Maddeningly for the people who want to reduce everything to data, visual "mistakes" are a tool that can be used to register extremely positive emotions when deployed with a full understanding of the human psyche and the local culture, just as much as they register displeasingly when used without that knowledge. So at best a machine can only achieve neutral mediocrity in producing an image. If photorealism or the imitation of any particular style were the key to successful illustration, we could have replaced actual illustrators 50 years ago.


> high end is going to just go further to show it's not AI-generated.

That actually is the argument the author makes. That companies compete to differentiate from what can be easily generated by anyone.


The author does strike a chord. I am personally waiting until I can get my hands on this tech myself, so I can use it to generate unique images for my blog posts. Designing them myself is a lot of work, and you also have to consider copyright and all that.


Does OpenAI cede copyright to works created by Dall-E? I realize they may encounter some challenges in trying to claim copyright over the images, but will that stop them from trying?


> Does OpenAI cede copyright to works created by Dall-E?

Yes.


>> I think when Unsplash (the free photography website) was founded in 2013 it killed the old vibe by democratizing access to great photography, and thereby ruining its function as a costly status signal.

Disagree with the premise. Great photography (the kind needed for a website anyway) was not costly. I got lots of images for apps and websites from iStockPhoto around that time and it was very inexpensive.


Lost me after the crap opening, my gosh. It wasn't mysterious. It was Stripe, the CSS frameworks of the time, and the shift to mobile/responsive design. Throw in a boom of developers new to the space and a JavaScript front-end undergoing a big change, and that's what it was. Lots of looks based on influential designs like Stripe, Apple, etc. and how they were being developed.


What is so interesting is that you get something really cool like DALL-E 2 and, instead of engaging with it and what it can actually do (which is currently moot, as we/I can't play with it and have to accept the curated propaganda from OpenAI), folks extrapolate it into a storyteller, an animator, a game designer....


I wouldn't call it "curated propaganda". I've seen livestreams of people using it and taking requests of what to feed into it. Check out @karenxcheng on Instagram.


I would argue against his theory of a vibe-shift in web design post-2015.

I certainly remember during 2007-2013 that there were enormous amounts of sites using illustrations in a very similar way to how they are now. In fact, I would go as far as to say there are way more sites using full-bleed "hero" photographs now than there were then.


GPT-3 and DALL-E 2 are only a glimpse of an upcoming change with greater impact on humanity than the Internet.


Shifts happen in design all the time; it's a stampede thing. One thing can start it and the rest follow suit. DALL-E may or may not, and then in a few more years something else will happen. DALL-E itself is not hugely important, though AI in general might be for the next few years.


An alternative hypothesis:

School Will Never End: On Infantilization in Digital Environments

http://sigwait.tk/~alex/doc/bunz%2Cmercedes__school-will-nev...


Thanks for that link!


I am skeptical of any art created without a significant amount of struggle. Maybe it is this strange fixation on hard work = quality mindset I have, but whatever it may be, I can't quite bring myself to use GitHub copilot and the rest of these tools.


This change was due more to the iOS 7 flat redesign than to what the author proposes.

A design trend where it became fashionable to have a flat design supported by illustration/iconography, moving away from skeuomorphism, which photography was closer to.


The best bit in the article is the Berkshire Hathaway website, a case of "the emperor wears no clothes and he knows it".


Twitter Blue subscribers can let Twitter verify their ownership of an NFT profile picture, and then their profile picture is distinguished by a hexagonal frame and a badge indicating which NFT collection it belongs to.

If AI does truly make generation of design of all styles costless, NFTs could act as the "Costly Vibe Signals" the article refers to.


I’ve been thinking about how NFTs are different from ordinary “costly” signals.

NFTs directly come with a price attached. An attached dollar value is the problem.

The difference between receiving an expensive-looking gift and a gift with an expensive price tag attached.

Maybe I’m wrong, but being subtle about the effort/cost is a big part of this signalling. NFTs skip all of that.


That reminds me of an article I read recently about New York's high-end nightlife scene, where wealthy men engage in an indirect form of purchase of attention from young attractive women, but obfuscated so that it does not appear transactional. There could be something to that.


Don't suppose you have a link or can recall what it was published in? Sounds like an interesting read


I did search for the article to provide a link in the comment, but I wasn't able to find it.



