DNA seen through the eyes of a coder (2017)

jfarlow · on Jan 25, 2018

As someone building software to design genetic tools, this is actually a pretty good overview. Its analogies are pretty solid, even if they are just analogies. It's a well-described snapshot of the early phase of education where this helps build a grand intuition and relations between disciplines. Just be careful to remember that they're just analogies, and in general can't be relied on to actually discover, invent, design or conclude.

The fun part for everyone here is that the science has improved so far, so quickly, that at this point we are starting to be able to improve, implement and otherwise make good on those coding-style capabilities. We can start to write, patch, update, and affect that code - and do so in intelligent and rational ways, rather than through the screens set up in the past century.

The next section I would add would be one that actually started to talk about the compiled programs - the proteins themselves. All that code, all those rules, all those heuristics are there to be run through an atomic 3D-printer to produce proteins. And in an ensemble, those proteins form a closed community within a cell, which then form communities in a tissue, which form communities called an organism. Those compiled proteins actually do the work at the lowest levels but often get a bit of short shrift while we all stare at their source-code in amazement. The source is elegant, but the programs are even more amazing, in my humble opinion.

Here are some 'profiles' of the kinds of economically and socially interesting programs we've started to make based on the above source-code. An 'app-store' of compiled apps if you will: https://serotiny.bio/notes/proteins/

westoncb · on Jan 25, 2018

> The source is elegant, but the programs are even more amazing, in my humble opinion.

Makes me curious about the 'runtime' which makes their impressive execution possible—granting that the analogy has definite limitations and could be misleading. The aspect of a 'runtime' which I see carrying over is that once the 'source' is 'compiled' into proteins, something makes the interesting behavior of those proteins what it is. I think where the analogy could be misleading is that the architecture of programming language runtimes tends to be a single coherent system with definite separation from programs executing within; whereas in the case of proteins, a large part of what provides for their biologically interesting behavior is just their general chemical properties (which are sort of contained within the 'programs' themselves, rather than in a separate external system). But there are probably aspects of the environment into which the proteins are released that systematically influence their behavior in a way that could be abstracted and studied (at which point it might have some loose resemblance in role to language runtimes). Maybe?

mygo · on Jan 26, 2018

Your 'runtime' is physics.

A protein is a chain of concatenated amino acids.

The chain will "ball up" into a 3-dimensional shape.

Its shape = its function.

Why does the balled-up chain function the way that it functions? Because other balled-up protein chains bump up against it in a certain way and suddenly an iron molecule can be contained within the confines of this mega-structure that has formed, and you now have hemoglobin.

Why does the chain ball up the way it does? Can we predict how a protein chain will ball up? Well, protein folding is an NP-Complete problem. Solve it and you will get at least $1M USD, change the world, etc.

- Software Engineer with a B.S. Degree in Biological Science w/ a focus on genetics + minor in Chem.

westoncb · on Jan 26, 2018

But physics is the 'runtime' for everything at some level of interpretation. I point out this other hypothetical system because it's possible for 'higher-level' runtimes to emerge from lower-level physical rules, and this looks like it might be one of those cases. (To clarify, by 'emerge', I mean another consistent set of behaviors appears which are capable of formal description and which can be viewed as consequences of a lower-level system without actually resembling that system.)

chillacy · on Jan 26, 2018

I had a discussion about this a few days ago. If the current state of DNA programming is done at the physical level, that's like writing your code by doing photolithography on silicon: not very productive. A level up, maybe the Von Neumann architecture for genetic engineering is the world of DNA and the various proteins like transcriptase. When will we come up with the equivalent of Structured Programming? Or Object Orientation? Or Operating Systems?

Each represents an abstraction over the others, and increases productivity.

nicwilson · on Jan 26, 2018

The level up from DNA is proteins, the molecules that actually do stuff. These are relatively structured: Primary structure is the amino acid sequence.

Secondary is the structures that form out of the primary sequence: Helices coil, like DNA except only a single helix, not double. Double and triple helices (see keratin and collagen respectively) are more like winding 2 or 3 single helices around themselves. And Sheets: two or more strands that bond to each other either parallel or antiparallel.

Then it starts to get interesting. These secondary structures form domains which are the functional subunit of proteins that actually do (or are in the case of non-enzymes) stuff. These are the equivalent of structured programming. E.g. join an antibody variable domain (the bit that sticks to stuff) to an enzyme, inject it into the bloodstream and you get expression of the enzyme wherever the antibody happens to bind (e.g. to a cancer cell).

(Tertiary and quaternary structure refer to the complete protein and to proteins that form a functional unit with other proteins respectively.)

The OO analogy is like the OO analogy for HDLs, the protein _is_ an object. OS is whole organism level.

gilleain · on Jan 26, 2018

Although it doesn't alter the analogy much, there is also the even higher level of 'quinary structure' - although it is disputed:

https://www.ncbi.nlm.nih.gov/pubmed/23943406

jfarlow · on Jan 26, 2018

It's what we do!

https://serotiny.bio/notes/pinecone/ :: http://biologylabs.utah.edu/jorgensen/wayned/ape/

as

C :: assembly

We'd even hope it's a bit like 'basic' rather than C, to the point where motivated non-scientists could even start to use it: https://serotiny.bio/notes/support/tutorials/

westoncb · on Jan 26, 2018

My understanding is we got programmable computers in large part because: 1) a bunch of scientific/engineering advances we were in a position to make were bottlenecked by massive amounts of 'boring' calculations 2) Some of those affected areas related to the war effort in critical ways, so lots of money and resources were poured into solving the problem.

Seems like we're in a similar position with '1' above, in connection with genetic engineering, so demand exists to abstract over and automate the boring stuff. But maybe the problem is significantly harder, or maybe we're just lacking the pressure that would lead to focused/cooperative effort and lots of resources pouring in, like we had in '2'.

jfarlow · on Jan 26, 2018

> lacking the pressure that would lead to focused/cooperative effort and lots of resources pouring in

Until literally a few months ago, hacking at the level of the protein abstraction was relegated to academics. Then Chimeric Antigen Receptors cured children of cancers, and were approved by the FDA. And so in August, Kite got bought by Gilead for $12B. Then in November Gilead bought another class of multi-domain protein biotherapeutics called SynNotch for $0.5B from an 18 month old startup. And Juno got purchased yesterday for $9B again for their work on CARs - multi-domain protein biotherapeutics.

We'd like to think that such cooperative effort and resources are just starting to pour in. But up until very recently, it was not obvious what the commercial value was when working at such an abstraction layer.

westoncb · on Jan 26, 2018

That's interesting to know, thanks. Regarding the 'cooperative' aspect, though—acquisitions might be falling a bit short of the ideal imo ;) But maybe it will be enough.

mygo · on Jan 26, 2018

But I mean, depending on how you look at it, at some level of interpretation everything can be a higher-level runtime.

Yes, protein-protein interactions are a higher-level runtime emerging from lower-level physical runtimes.

But in the same vein, Hacker-news is a higher-level runtime emerging, in part, from protein-protein interactions, emerging from lower-level physical runtimes.

Can we describe the runtime without invoking some notion of purpose nor questioning free will?

westoncb · on Jan 26, 2018

> at some level of interpretation everything can be a higher-level runtime.

I don't see how everything can be considered a higher-level runtime. The concept of an execution environment just doesn't map to everything, unless you want to use a definition of it so general as to make it useless. If someone drops a glass and it shatters, would a higher level system describing the mechanics of glass shattering be a runtime? I don't see it.

> Yes, protein-protein interactions are a higher-level runtime emerging from lower-level physical runtimes.

That's pretty different from what I was describing. See my other comment for more details: https://news.ycombinator.com/item?id=16235639

> ... Hacker-news is a higher-level runtime emerging, in part, from protein-protein interactions, emerging from lower-level physical runtimes.

I think we disagree on the meaning of the word 'runtime'. I would not consider Hacker News to be a runtime. I do agree with your main point there, though, if you just replaced 'Hacker News' in your statement with, say, the Java runtime environment, then yes, I think it is a higher-level system, usefully described as a runtime, which has emerged as a consequence of the laws of physics doing what they always do.

> Can we describe the runtime without invoking some notion of purpose nor questioning free will?

My usage of 'runtime' does not involve free will nor purpose.

mygo · on Jan 26, 2018

> If someone drops a glass and it shatters, would a higher level system describing the mechanics of glass shattering be a runtime? I don't see it.

It can be a runtime, if some interface is built on top of the phenomenon of the shattering of glass.

Because humans.

jfarlow · on Jan 25, 2018

In that regard, the protein programs exist and interact with each other in a runtime environment/OS that is 'physics'. Which is cool.

Especially, because those programs, when run together, along with all sorts of other small molecule/lipid/temperature conditions are sufficient to output, well, me.

westoncb · on Jan 25, 2018

I was somewhat avoiding bringing that up just because, in theory, every physical 'system' in the universe just boils down to physics—and this question gets pretty close to the line between where it's useful to start talking about a higher-level system, versus thinking in terms of chemistry (and/or physics). So it could get confusing.

If there is a higher level system analogous to a runtime-environment which could be abstracted from actual genetic machinery, I would bet the rules of its operation would still be at least partially in terms of chemical rules (in other words, you couldn't abstract over it entirely).

My justification for it is that I think the system is split between the decentralized behavior of proteins and enzymes etc. just following the rules they always follow, combined with the external 'system' which sets up conditions constraining and otherwise directing the proteins. That second system could maybe be higher level, but it's still central to its operation that it interfaces with a 'raw' chemical system. (Just an idea of course—my confidence that it means anything is ~12%).

Edit: another analogy to maybe clarify (or confuse) things. I see the proteins/enzymes etc. which I described as decentralized as operating like a cellular automaton (e.g. Conway's GoL), but then the 'external'/centralized system as like another layer of rules which sort of pushes around clusters of cells in the grid in various significant ways, while all those clusters are still locally just executing the simple automaton rules.
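To make the two-layer picture above concrete, here is a minimal Python sketch, offered as an illustration of the analogy rather than a model of any real cell. `life_step` is the standard Conway's Game of Life rule (the local, decentralized layer). `external_push` is a hypothetical stand-in for the 'external'/centralized layer: it relocates a whole cluster without altering the local rules. The function names and the blinker example are assumptions made for this sketch.

```python
from collections import Counter

def life_step(alive: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """One generation of Conway's Game of Life on an unbounded grid."""
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in alive
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }

def external_push(alive: set[tuple[int, int]], dx: int, dy: int) -> set[tuple[int, int]]:
    """Hypothetical 'centralized' layer: move a whole cluster, leaving local rules untouched."""
    return {(x + dx, y + dy) for (x, y) in alive}

blinker = {(0, 0), (1, 0), (2, 0)}
state = blinker
for generation in range(4):
    state = life_step(state)                # local automaton rule
    if generation % 2 == 1:
        state = external_push(state, 5, 0)  # occasional outside intervention
print(sorted(state))
```

The cluster keeps oscillating by the local rule alone; the outer layer only decides where it ends up, which is roughly the division of labour the comment speculates about.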

jfarlow · on Jan 25, 2018

Very much. We've actually tried to capture some of those runtime variables within proteins and break down the programs into smaller composable units. And start to keep track of which conditions are required for particular functionalities of which components. Life has already made a bunch for us to look at.

Ultimately, in our hands, those properties are strongly linked to physical associations, locations within a cell, enzymatic capabilities, and i/o (light, heat, chemical concentrations, force, etc.).

Then you start (re)building little i/o robots from reusable components for use by and within cells that are regulated by their location and associations (in a wet Brownian environment). All by coding using that same genetic source.

jacquesm · on Jan 26, 2018

> Makes me curious about the 'runtime' which makes their impressive execution possible

https://en.wikipedia.org/wiki/Ribosome

yorwba · on Jan 26, 2018

If DNA is the source code and proteins are the compiled programs, the Ribosome is the compiler, not the runtime.

jacquesm · on Jan 26, 2018

It's an analogy. You could also say that DNA is the program, the Ribosome is the interpreter and proteins are the output, which is much closer to the truth.
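For anyone who wants to poke at the compiler-vs-interpreter framing, here is a toy Python sketch of the "ribosome as interpreter" reading. It is an assumption-laden simplification: the codon table holds only eight of the 64 real codons (the ones listed are real assignments), and `transcribe` and `translate` are hypothetical helper names invented for this example.

```python
# Toy "ribosome as interpreter": reads mRNA three letters at a time.
# Only a handful of the 64 real codons are included, to keep the sketch short.
CODON_TABLE = {
    "AUG": "Met",  # also the start signal
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": None,   # stop
    "UAG": None,   # stop
    "UGA": None,   # stop
}

def transcribe(dna_coding_strand: str) -> str:
    """DNA -> mRNA: for the coding strand this is just T -> U."""
    return dna_coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """Start at the first AUG, emit one amino acid per codon, halt at a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this toy table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon: release the chain
            break
        protein.append(residue)
    return protein

print(translate(transcribe("ccATGTTTGGCAAATAAgg")))
```

Whether you call `translate` a compiler or an interpreter depends on whether you treat the returned chain as the "binary" or as the "output", which is more or less the disagreement in this subthread.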

leoc · on Jan 26, 2018

Reflections on Trusting Trust...

sporkologist · on Jan 26, 2018

Wouldn't the ribosome be the compiler, that translates codes to proteins?

jacquesm · on Jan 26, 2018

hyperpallium · on Jan 26, 2018

How is it that proteins are able to build structures, like microtubules, cell walls and bone?

jfarlow · on Jan 26, 2018

Proteins are 1-dimensional chains of 'lego-like' molecules that fold into atomically-precise 3D tools. There are ~20 types of legos that have properties that differ along a number of axes like hydrophobicity, charge, reactivity, size, flexibility, etc. And those 20, in combination, seem to be enough to build all of the tools life needs to interact with its environment. Most proteins are assemblies of a few hundred such 'legos' (again, of which there are 20 types).

Those 3D structures then, because they're encoded in the 'tape-like' DNA, can actually themselves be modular, rearranged, and this chunk can be lopped off and plopped onto that chunk, or assembled with this chunk over here. When you string a 'firefly-shine' protein next to a 'in-the-nucleus' protein, the firefly-shine protein will now be physically connected to the 'in-the-nucleus' signal, so you get firefly-shine in the nucleus. Now encode all the things proteins can do: bind to things, enzymatically make things, go places, make structures, sense chemicals, light, force, electricity, other atoms, etc. - and you get little multi-part robots made of those various active 'chunks'.

If you have a protein piece that itself assembles in a hexagonal-like shape, you could imagine assembling a number of them into a soccer ball - just as a virus does to protect its DNA.

Or if there are small Brownian-aware 'ratchets' attached to stiff fiber-like proteins, you can get proteins that move or walk: https://www.youtube.com/watch?v=y-uuk4Pr2i8

The microtubules they're walking over are themselves linear assemblies of proteins: https://www.youtube.com/watch?v=jGmz4xVP50M

Similarly, spider silk is protein: https://boltthreads.com/

And leather is protein: http://provenance.bio/

Here are some more animations. The only thing wrong is how empty the space is. In real life, there is no empty space at all - it's jam-packed with other proteins and molecules:

https://www.youtube.com/watch?v=FzcTgrxMzZk

https://www.youtube.com/watch?v=VdmbpAo9JR4

Bone is a mineral deposit, deposited by proteins within particular cells: https://en.wikipedia.org/wiki/Osteoblast#Osteogenesis

Cell walls or lipid membranes are fats that are generated in the right concentration by proteins, and then allowed to self-assemble, creating well-controlled environmental compartments for sensitive proteins to remain functional.

TeMPOraL · on Jan 26, 2018

> The only thing wrong is how empty the space is. In real life, there is no empty space at all - it's jam-packed with other proteins and molecules

That was a huge revelation for me. I'm reading through The Machinery of Life by David S. Goodsell; his drawings drive this point home very well. It turns out the cells are so densely packed, that only small things can meaningfully move around at all (through diffusion, i.e. by bouncing around) - for larger stuff, there are dedicated mechanisms in the cell that ship big molecules around!

mygo · on Jan 26, 2018

You made this? Awesome, I ran into it a while back and thought it was really interesting. Does this mean you're tapped into the synthetic biology networks? Would love to chat about that, as it's something I want to get into more due to my biology + programming background.

jfarlow · on Jan 26, 2018

We did, brother & I. We're trying to make useful the abstraction layer above DNA. We think there's much benefit to thinking and placing complexity at the protein level, and back-calculating the DNA, completely offloading the complexity of actually manufacturing DNA.

I'd be happy to chat. Email is my first name at serotiny.bio

-Justin

jfarlow · on Jan 26, 2018

To be clear, I'm not the writer of the original post about DNA as code. I created Serotiny and our protein design software, and was providing my thoughts on the post given my background.

visarga · on Jan 26, 2018

You know what blew my mind about DNA? The "Gene Regulatory Networks". https://en.wikipedia.org/wiki/Gene_regulatory_network

They work analogously to neural networks, where genes act as signals to other genes in a complex graph of dependency. Isn't that amazing? Each cell has a "gene based brain" (an RNN) which can do computation - it can map states to actions. Each neuron in the brain is actually a whole neural network. GRNs are probably the most energy efficient neural nets that exist in the universe, and they're also self replicators.

bmsran · on Jan 26, 2018

"Is here. This not a joke. We can wonder about the license though. Maybe we should ask the walking product of this source: Craig Venter."

The reference build of the human genome (provided here by Ensembl) is almost entirely derived from the public human genome sequencing project, not the private project led by Venter.

chrisamiller · on Jan 26, 2018

And the public reference genome is about 2/3 from a single African American individual from Buffalo, New York!

colemannugent · on Jan 25, 2018

Definitely an interesting take on the "coding" of DNA.

Something that has always interested me is similarities in natural and unnatural mechanisms, like how modern cameras mimic eyes right down to the tiny motors that drive the auto-focus mechanisms.

The long sequence of nucleotides reminds me of a strange sort of Turing machine capable of building humans.

dekhn · on Jan 25, 2018

The big difference between modern cameras and eyes is that a huge amount of neural processing is done in the retina of the eye. There aren't any cameras (that I know of) that compute such sophisticated levels of processing (edge detecting, contrast detecting, motion detecting) within the sensor package.

kevin_thibedeau · on Jan 26, 2018

There are CMOS sensors with in-array edge detection. More flexibly, all optical mice have integrated sensors that do motion processing in the same package if not the same die.

dandare · on Jan 25, 2018

This is amazing and I was looking for such analysis for a long time. I only wish it included more information about the actual programming language of DNA.

>Now, DNA is not like a computer programming language. It really isn't.

Maybe it would be interesting to talk about how it differs and what programming paradigms do not have analogies in the genetic code.

tehsauce · on Jan 25, 2018

A very important aspect of DNA as code is the heavy use of macros. The code itself heavily influences the expression of other parts of the code.

scalio · on Jan 26, 2018

How so?

teekert · on Jan 26, 2018

Very nice piece, as a biologist turned programmer I think it nicely sums things up. Some remarks though...

"Similarly, as an embryo develops in the mother's womb, its DNA is edited substantially to reduce its growth rate, and the size of the placenta. In such a way, the competing interests of the father ('large strong children') and the mother ('survive pregnancy') are balanced. Such 'imprinting' can only happen within the mother, since the father's genome doesn't know anything about the size of the mother."

This is strangely worded. I think there is indeed a trade-off between the size of the child and the mother surviving, but the father also benefits from the mother surviving (better have breast milk available and love and care, right? Father genome?). Moreover, the father's genome is half inherited from a woman and is mingled with a woman's DNA every generation (most of the genome is not specifically male or female). I don't believe there is evidence that the Y chromosome solely drives this push for a larger child in the uterus, but please correct me if I'm wrong.

I also think it is theoretically possible to produce a female child from two fathers by merging their genomes and supplying the child with two X chromosomes and no Y chromosome. So the statement that "the father's genome doesn't know anything about the size of the mother" is also wrong (for the used definition of "knowing"), the male genome "knows" just as much as the mother.

ajuc · on Jan 26, 2018

> So the statement that "the father's genome doesn't know anything about the size of the mother" is also wrong (for the used definition of "knowing"), the male genome "knows" just as much as the mother.

It knows something about a possible future female child, but not about the particular mother that will have to give birth?

Razengan · on Jan 26, 2018

As a layman, I’ve always wondered about something but I’m not sure how exactly to ask/frame the question:

How did biological “software” “evolve?”

As in, the basic features that are common to many macroscopic lifeforms, like knowing one’s position within 3D space, as well as the related-but-distinct sense of proprioception [0].

[0] https://en.wikipedia.org/wiki/Proprioception

chrisamiller · on Jan 26, 2018

The short answer is that it was useful! The parameter space of things that have been tried in uncountable organisms over _billions_ of years is vast.

And proprioception is incredibly useful, even in small doses. Imagine you're a single-celled organism:

At the simplest level, some vague sensory input about your surroundings so that you can avoid the predator/find the food gives you a massive advantage over other organisms that can't. Every random mutation that improves this ability, even a tiny bit, is rewarded, as you can grow faster/die less and ultimately, reproduce more. (this is what we call "fitness" - how many of your genes get passed on).

If you get a mutation that hurts this ability, you're going to be massively penalized - eat less, grow slower, reproduce less.

In pretty short order, the organisms with higher fitness are going to take over.

Mutations occur all the time. In humans, the rate is very low - about 3 new mutations during every cell division. In many bacteria, the fidelity rate is much lower, and the reproduction rate is much higher, and so they are _constantly_ exploring new combinations of parameters. Most are neutral, many are disastrous, a few give a slight advantage.

Pile up these small advantages over millions or billions of years, and gradually, in fits and starts, very complex abilities evolve.
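The mutate-select-repeat loop described above can be sketched in a few lines of Python. This is a deliberately crude illustration, not a population-genetics model: the genome is a list of bits, "fitness" is simply the number of 1s, and the population size, mutation rate and seed are arbitrary assumptions chosen for the example.

```python
import random

def fitness(genome: list[int]) -> int:
    """Stand-in fitness: the number of 1 bits. Real fitness is reproductive success."""
    return sum(genome)

def mutate(genome: list[int], rate: float, rng: random.Random) -> list[int]:
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def evolve(generations: int = 200, pop_size: int = 50, length: int = 40,
           rate: float = 0.01, seed: int = 0) -> list[int]:
    """Return the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[0] * length for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Selection: the fitter half survives.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        # Survivors persist unchanged; each also leaves one mutated offspring.
        population = survivors + [mutate(g, rate, rng) for g in survivors]
    return history

history = evolve()
print(history[0], history[-1])
```

Most individual mutations here do nothing useful or make things worse, yet the best fitness only ratchets upward, because harmful variants are out-reproduced. That is the "pile up small advantages" point in miniature.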

searine · on Jan 25, 2018

This is a fun parallel, but I feel like it fails to go the other direction.

How is DNA different from programming?

This is important to understand because those differences are the foundation of our intuition about how DNA operates. We can't let ourselves fall into the misunderstanding that cells are like computers. In particular, the ideas of random mutation and populations are inherently different from software.

Imagine a piece of code whose bits slowly decay over time. Where functions compete with functions in every other program on the filesystem to see who is most efficient. Where scripts need to constantly copy themselves to other folders simply to maintain their integrity.

It's this kind of stochastic and unreliable environment that I think a lot of people forget about when talking about DNA. Yes it is a consistent heritable genetic library, but it is also in a constant state of change. Genes are strange little islands in a sea of noise.

jfarlow · on Jan 25, 2018

Adding to that conceptual challenge is separating in time the various components of the regulation and manipulation in the above analogies. Low-level programmers have an intuition of the 'time-cost' value of various hardware calls - the various caches, RAM, HD, network, etc. and how they can be many orders of magnitude different even if the API treats them identically.

Life goes the other way and requires you not only to integrate over nanoseconds, but also over petaseconds. And at most scales in between there are mechanisms of regulation and operation.

Robotbeat · on Jan 25, 2018

> Imagine... Where functions compete with functions in every other program on the filesystem to see who is most efficient. Where scripts need to constantly copy themselves to other folders simply to maintain their integrity.

That sounds a lot like computer viruses.

tzahola · on Jan 25, 2018

Good analogy!

Funny, but in fact there’s even a whole class of self-replicating biomachines named after computer viruses! https://en.m.wikipedia.org/wiki/Introduction_to_viruses

The similarities are eerie.

agumonkey · on Jan 25, 2018

It's not different, but so far no one knows the interpreter semantics.

tvelichkov · on Jan 26, 2018

Great article, so no unit testing? I knew it!

callesgg · on Jan 26, 2018

From what I know genes don’t exist as distinct parts of the genome, genes are just a way that we humans grouped the DNA.

97% is not junk; it is just not coding for proteins. The machinery can jump in to “commented” genes. We don’t know exactly how non-protein-coding DNA works at the lowest level. A lot of it is probably useless, but you probably can’t remove it without destroying stuff.

Compared to a software project the DNA code has the worst code quality you will ever see in a working system.

Almost everything is dependent on everything else. It is essentially a software project made by a toddler cutting and pasting assembler code during millions of years.

thriftwy · on Jan 26, 2018

Calling it junk didn't make sense to me 10 years ago, and today it is plain false.

DNA before protein sequence is used for binding modifiers. This means that every protein has a huge if() { before it, and many kinds of different stuff can bind there in order to either suppress production of this protein or increase it. This is how all this stuff works.

How would it work otherwise, I always wondered. Would all proteins in DNA be produced at the same rate, as naive models imply? It turns out they aren't, and this is regulated by areas in 'junk' DNA.

Complaining about junk in DNA is like complaining about dispatch in a computer program. All kinds of ifs and whiles and fors. Everybody knows our programs are 97% dispatch and 3% business logic computations after all. Or worse.
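A rough Python sketch of the "huge if() in front of every protein" picture, under loud assumptions: the `Gene` class, its field names and the fold-change numbers are all invented for illustration, and real regulation is graded and combinatorial rather than a clean on/off switch.

```python
from dataclasses import dataclass, field

@dataclass
class Gene:
    """Hypothetical toy gene: a basal rate plus regulatory 'binding sites'."""
    name: str
    basal_rate: float                                            # arbitrary units
    activators: dict[str, float] = field(default_factory=dict)   # factor -> fold increase
    repressors: set[str] = field(default_factory=set)            # factors that switch it off

    def expression_rate(self, bound_factors: set[str]) -> float:
        # The "huge if()": any bound repressor silences the gene outright.
        if self.repressors & bound_factors:
            return 0.0
        rate = self.basal_rate
        # Each bound activator multiplies production.
        for factor, fold in self.activators.items():
            if factor in bound_factors:
                rate *= fold
        return rate

gene = Gene("toy_gene", basal_rate=1.0, activators={"A": 5.0}, repressors={"R"})
for bound in (set(), {"A"}, {"A", "R"}):
    print(sorted(bound), gene.expression_rate(bound))
```

The protein-coding part never appears in this code at all; everything interesting happens in the dispatch, which is the point being made about the non-coding 97%.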

TeMPOraL · on Jan 26, 2018

> Almost everything is dependent on everything else. It is essentially a software project made by a toddler cutting and pasting assembler code during millions of years.

I prefer to see it as code written by demoscene genius hackers - hyperoptimized, "almost everything is dependent on everything else" - the same opcode may have three different purposes in a program, plus being data to some other code.

Still, I feel we're missing some important part of the picture - our current descriptions make living systems seem much more fragile than they really are. Maybe there's a new model / structural abstraction waiting to be discovered, that would make DNA machinery seem less ad-hoc?

ajuc · on Jan 26, 2018

Saying non-coding DNA is junk is like saying non-printf code is junk :)

ozy · on Jan 25, 2018

I might have missed it, but key is that DNA is not executed linearly. Instead all instructions are executed all the time. Much like the difference in normal programming and programming an FPGA, but then probabilistic.

dsnuh · on Jan 26, 2018

>Furthermore, 97% of your DNA is commented out. DNA is linear and read from start to end.

Looks like he is saying the opposite?

Obi_Juan_Kenobi · on Jan 26, 2018

That statement is only true in a very narrow context.

Much of your DNA is silenced by being formed into dense chromatin - DNA that is tightly packed onto a protein scaffold. However, the regions that are silenced change throughout development, and these regions are not strictly inaccessible. This is seen with CNSes, conserved non-coding sequences. As the name implies, these are not traditional genes that make protein products, but are usually small regulatory sequences that often affect chromatin state, and sometimes at a great distance. I've had colleagues attempt for years to identify a mutation, only to find that the causative change occurred many kilobases away from the relevant gene. This is partly the reason why DNA is not 'plug and play', as the genomic context of a particular sequence often matters.

Which is why DNA isn't really linear, at least on a genomic scale. The processes of transcription and translation (DNA -> RNA -> Protein) are linear to be sure, but the regulatory networks that determine gene expression are happening all at once on a massively parallel scale. These molecules are also 3-dimensional and can fold back on themselves and cause modifications. Essentially every fundamental process in gene expression is able to be regulated, whether it's enhancers and repressors affecting transcription, RNA-mediated silencing, RNA splicing, RNA modifications and stability, histone modifications, protein modifications, protein stability, phosphorylation or any of the dozens of other forms of post-translational regulation, and on and on and on. And there are thousands of genes which can impact all manner of other genes at any of those levels, either directly or otherwise. It's a wonder that we're able to tease anything out of it that makes sense.

Generally speaking, I caution against computer analogies for biology. Biology is messy! It rarely works how you want it to. Even relatively common and basic techniques in molecular biology require a great deal of troubleshooting, and even in the best labs with loads of experience, sometimes things simply refuse to work how you'd like them to. You don't hear too much about synthetic biology anymore (it's still going, just less hype) because the premise was incredibly naive; you were never going to get bits and pieces of DNA to behave in a predictable manner. Just about everyone with wet lab experience suspected this.

For context, the current hype about CRISPR is almost entirely based on how well it works, not what it does. We've had a few different techniques for modifying DNA for a while now, but none of them worked quite well enough to practically accomplish all the interesting things that CRISPR is now enabling.

tw1010 · on Jan 25, 2018

More like (2002)

WhitneyLand · on Jan 25, 2018

What’s wrong with it? Seems like a fantastic micro introduction to DNA for those with a computer science background.

Of course it’s not perfect. I’d bet most readers like myself were enjoying it, while thinking of analogies that might be more apt or illustrative. But the power of it is how quickly it can establish a frame of reference, and if desired a jump off point for further understanding.

If there are parts so fundamentally flawed or outdated as to negate the value, please, point them out. Or if you know of a better article that illuminates as much given only a 10 minute time investment, by all means please share it.

austinprete · on Jan 25, 2018

I took his comment to reference the fact that this was originally written in 2002 with minor additions over time, as evidenced by the “Updates” section.

tw1010 · on Jan 26, 2018

This was the intent of my comment.

tantalor · on Jan 25, 2018

Yep, the Wayback Machine places the origin in June 2002.

https://web.archive.org/web/20020601182120/https://ds9a.nl/a...

sundarurfriend · on Jan 25, 2018

Yeah, it was written in 2002, and it's unclear if the content was updated after that, so I'd originally left the year unspecified in the title.

By the way, the author apparently did a talk on this in Aug 2017, in the presence of biologists, with some updates about new tech like CRISPR. I submitted a link to the blog post about this[1] yesterday, but it didn't gain much traction then (probably due to bad title: "DNA: The Code of Life" just sounds like some basic biology piece).

[1] https://medium.com/@bert.hubert/dna-the-code-of-life-12db4a1...

fjfaase · on Jan 25, 2018

I think that his talks at SHA2017 were the best. Not just with respect to the contents but also for the manner he presented them. A definite watch for anyone who wants to know a little more about DNA.

bitwize · on Jan 25, 2018

Since this was written, we've discovered CRISPR, which started off as a bacterial analogue to antivirus software that works against actual physical viruses.

sundarurfriend · on Jan 25, 2018

The author did a talk on this with updated content, apparently including "A little bit on CRISPR". Blog post with links to the talk videos: https://medium.com/@bert.hubert/dna-the-code-of-life-12db4a1...

tritium · on Jan 26, 2018

  What happens if you copy paste 
  the 'legs selector' part of a 
  mouse HOX gene into the fruitfly 
  Homeobox:

    'In fact, when the mouse Hox-B6 
     gene is inserted in Drosophila, 
     it can substitute for Antennapedia 
     and produce legs in place of 
     antennae'

Mother of God, is that ever strange.

rumcajz · on Jan 26, 2018

On similar topic: http://250bpm.com/blog:89

Maultasche · on Jan 25, 2018

Interesting, that was a really useful analogy.

So what I learned is that our body is like an Actor model with lots of asynchronous processes sending messages to each other. I'm sure that's a leaky analogy, but it makes things simpler in my mind. I have the sudden desire to go emulate it with Elixir.

ajuc · on Jan 26, 2018

My internal model of DNA programming:

- make 1 million sed scripts acting on directory X

- put all of them in directory X

- run all of them in parallel in infinite loops
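The three steps above can be approximated in Python, with heavy caveats. Real parallelism is replaced by picking one rule at random per step, the "directory" is a single string, and the three rewrite rules are invented purely for illustration; none of them corresponds to real biochemistry.

```python
import random
import re

# Each "script" is a (pattern, replacement) pair, like a one-line sed s/// command.
# These rules are made up for the sketch; they are not real biology.
RULES = [
    (r"A", "AB"),    # an A emits a B next to itself
    (r"BBB", "C"),   # three adjacent Bs condense into a C
    (r"CC", "A"),    # two adjacent Cs become a new A
]

def run(soup: str, steps: int, seed: int = 0) -> str:
    """Apply randomly chosen rules one at a time; no rule knows about the others."""
    rng = random.Random(seed)
    for _ in range(steps):
        pattern, replacement = rng.choice(RULES)
        # count=1: each firing rewrites only the first match it finds.
        soup = re.sub(pattern, replacement, soup, count=1)
    return soup

result = run("A", steps=100, seed=1)
print(result)
```

No rule is "in charge", and each only reacts to what the others left behind, which is about as far as this mental model goes.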

tzahola · on Jan 25, 2018

Ah, the good ol’ ”imagine gravity like aaaa... sheet of rubber”. Except this time for biology!

dandare · on Jan 26, 2018

So you prefer the "imagine gravity like aaaa curvature of space and time"?

We explain our science in analogies. We think in analogies. And DNA like a programming language is a useful analogy.

chrisamiller · on Jan 26, 2018

It's definitely a good starting place for people from a CS background, but if you really want to learn biology, don't get too attached! As you get deeper, you'll quickly discover the limitations of that analogy.

coldtea · on Jan 26, 2018

Nope, actually a very good analogy, which is miles better than the gravity as a sheet.