I've been using git since 2007, and this only dawned on me last year.
Git is especially prone to the sort of confusion where all the experts you know use it in slightly different ways so the culture is to just wing it until you're your own unique kind of wizard who can't tie his shoes because he favors sandals anyhow.
I like this comment. Over the years I've found that whenever I watch others use git, everyone uses it in a different way and even for different purposes. This has left me really confused about what the standard practice for Git actually is.
This is because people have different needs, Git tries to cover too many of them, and there are multiple ways to achieve the same goal, so there is no standard practice. There is no single, standard way to cook chicken breast, and that is a far simpler thing.
The solution is to set team/department standards inside companies, or to use whatever you need as a single contributor. I've seen attempts to standardize across a quite decentralized company, and they failed every time.
> there are multiple ways to achieve goals and therefore there is no standard practice
This is ultimately where, and why, github succeeded. It's not that it was free for open source. It's that it ironed out lots of kinks in a common group flow.
Git is a cultural miracle, and maybe it wouldn't have got its early traction if it had been overly prescriptive or proscriptive, but more focus on those workflows earlier on would have changed history.
This sort of thing is part of the problem. If it takes reading such a long manual to understand how to properly use Git, it's no wonder everyone's workflow is different.
I don't see it as a problem that everyone's workflow is different, and, separately, I don't see it as a problem that it takes reading such a long manual to understand all the possibilities of Git. There is no royal road to geometry. Pro Git is a lot shorter than the textbook I learned calculus from.
Unlike calculus, though, you can learn enough about Git to use it usefully in ten minutes. Maybe this sets people up for disappointment when they find out that afterwards their progress isn't that fast.
Agreed. Back when I first came across git in 2009 I had to re-read the porcelain manual three times before I really got it, but the conceptual understanding has been useful ever since. I'm often the guy explaining git to newbies on my team.
Agreed. I'd read the manual if there was something I needed from it, but everything is working fine. Yeah, I might've rsynced between some local folders once or twice when I could've used git, and maybe that was an inelegant approach, but the marginal cost of that blunder was... about as much time as I've spent in this thread, so whatever.
The nice thing about knowing more about git is that it unlocks another dimension in editing code. It's a very powerful version of undo-redo, aka time travelling. Then you start to think in terms of changes and patches.
One example of that is the suckless philosophy, where extra features come as patches and diffs.
With previous version-control systems, such as SVN and CVS, I found that pair programming helps a great deal with this problem. I started using Git after my last pair-programming gig, unfortunately, but I imagine it would help there too.
(I started using Git in 02009, with networking strictly over ssh and, for pulls, HTTP.)
> all the experts you know use it in slightly different ways
What? Knowing that a git repo is just a folder is nowhere near "expert" level. That's basic knowledge, just like knowing that the commits are nodes of a DAG. Sadly, most git users have no idea how the tool works. It's a strange situation, it'd be like if a majority of drivers didn't know how to change gears.
> It's a strange situation, it'd be like if a majority of drivers didn't know how to change gears.
If you literally can't change gears then your choices are a) go nowhere (neutral), b) burn out your clutch (higher gears), or c) burn out your engine (1st gear). All are bad things. Even having an expert come along to put you in the correct gear once, twice, or even ten times won't improve things.
If a programmer doesn't know that git is a folder or that the commits are nodes of a DAG, nothing bad will happen in the short term. And if they have a git expert who can get them unstuck say, five times total, they can probably make it to the end of their career without having to learn those two details of git.
It's an analogy, there's no need to analyze it literally. And no, I've worked with some devs who don't understand git (thankfully I don't anymore) and it was quite a bit more than "five times" they got stuck or messed up the repo on the remote in an annoying way. Sure, if you regularly write code using a bunch of evals or gotos "nothing bad will happen" but it's a very suboptimal way of doing things.
"Expert level knowledge" implies something more to me than simply few people knowing about it. It's ridiculous to say that knowing how to change gears makes you an expert driver, even if a minority know how to do it (such as in the US e.g.)
My point is only that the understanding is uneven. I'm ready to debate the merits of subtrees vs submodules but I didn't know the folder thing. Am I weird? Yes, but here is a place where weird is commonplace.
> just like knowing that the commits are nodes of a DAG
Hello gatekeeping! I have used Git for more than 10 years. I could not explain all of the ins-and-outs of commits, especially that they are "nodes of a DAG". I do just fine, and Git is wonderful to me. Another related example: I would say that 90%+ of .NET and Java users don't intimately understand their virtual machine that runs their code. Hot take: That is fine in 2025; they are still very productive and add lots of value.
"Intimately understand the VM" is not the same as knowing what data structure you're using. It'd be comparable to not knowing the difference between an array and a linked list. Sure you may call it gatekeeping but likewise I may call your style willful ignorance of the basics of the tools you're using. Have you never used rebase or cherry-pick?
In the past I've blown coworkers' minds during GitHub outages when I just pulled code from a coworker's machine and kept working.
When working remote, if your company stubbornly refuses to use a modern VPN like Tailscale and you can't easily network between two computers, git format-patch and git am, coupled with something like Slack messages, work well enough, albeit moderately cumbersome.
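For anyone curious, a rough sketch of both approaches (the host, path, and branch names here are made up):

    # pull straight from a coworker's machine over ssh during an outage
    git remote add alice ssh://alice@alice-laptop/home/alice/project
    git fetch alice
    git merge alice/main                 # assuming their branch is main

    # or the patch route: export the last few commits as a single file...
    git format-patch -3 --stdout > fix.patch
    # ...send fix.patch over Slack, and the other side applies it with:
    git am fix.patch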
Yes. I've been subject to claims that a single person can't start a project unless and until an official, centralized repo is setup for them. I've responded with "git init is all that is necessary to get started", but they wouldn't hear it.
Depends, what's the reasoning? Because technically anyone can start a project even without Git. Or even without a computer. Someone can use a pen to write code on a paper.
Depends on what you mean by "a project". If it's policy related, maybe it's the company's policy that all code that is written must be stored in a certain way, for a multitude of reasons.
They don't have a reason. There's no policy that keeps them from doing this. Sure, the whole point is to ultimately have the code in a common place where backups and code review can happen, but if it's a matter of starting something sooner because it takes a few days for the request to flow through to get things set up, they are not constrained by that AT ALL. They can create a git repo with git init immediately, start working, and once the repo is set up in the common area, git push all their work into it. Rather than train people on this, we spend time trying to hasten the common area repo setup time and put additional unneeded burden on the team responsible for that.
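A hedged sketch of that flow, with a placeholder URL for wherever the blessed repo eventually lands:

    # day one: no server, no ticket, just start working
    mkdir my-project && cd my-project
    git init
    # ...write code, commit as usual...

    # weeks later, once the official repo finally exists:
    git remote add origin ssh://git.example.com/team/my-project.git
    git push -u origin --all             # every branch made so far
    git push origin --tags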
It doesn't matter. They are centralized on servers that are ssh accessible, creating it is effectively mkdir and git init.
It's not about how long the action takes, it's about how much the team responsible for that is loaded and can prioritize things. Every team needs more round tuits. Anyone who works in an IT support role knows this. The point is that they can self-service immediately and there is no actual dependency to start writing code and using revision control, but people will trot out any excuse.
But why can't the teams themselves do it? All places I've seen or been to have had teams able to create their own repositories, either they use cloud Git providers like Bitbucket, Gitlab or Github, or they have self hosted Gitlab, Github etc.
Lots of places (unfortunately) restrict repo creation or CI pipeline creation. The platform team might need to spin up the standard stack for your project, VPN access might need to be added for AWS environments, etc. In the sorts of orgs where this happens, doing it properly is more important than getting started.
Tons of people never even touch git cli, they use some gui frontend/IDE.
Tons of people who DO use the git cli don't know git init. Their whole life has been: create a project on GitHub and clone it. Anyway, initting a new project isn't the most "basic" thing in git; it accounts for less than 0.01% of total git commands.
If you combine the above, easily MOST people have no idea about git init.
You must work at Microsoft? A pound of paperwork for every new repo really shuts down experimental side projects. I showed my colleagues that we can share code via ssh or (painfully) one-drive anytime instead. They reacted like I was asking them to smoke crack behind the dumpsters. “That’s dangerous, gonna get in trouble, no way bro”
Please, elaborate. I can share my screen with coworkers and talk about all sorts of confidential things, and I can even give them full remote access to control everything if I wished. So why would pushing some plain-text code directly to their machine be so fundamentally different from all the other means of passing bits between our machines?
If you share your screen you are in control of what you show; if you give someone SSH access, what would stop them from running a small script to fetch everything you have, or doing whatever with your computer? To me it's a blatant security violation. There's just no reason to do that.
In large corps you usually have policies against leaving your laptop logged in and unattended in the office; this would potentially be even worse than that.
There is nothing inherently special about code compared to, say, a confidential marketing deck or sales plan. If those can go on a network drive, or a service like OneDrive, why can't we put our code there? I'm not talking about the Xbox firmware or the entire Windows source. This is about little one-off projects, highly specialized tooling, or experimental proofs of concept that are blocked by bureaucracy.
It's a misguided policy that hurts morale and leaves a tremendous amount of productivity and value on the floor. And I suspect that many of the policies are in place simply because a number of the rule makers aren't aware of how easy it is to share the code. Look how many in this thread alone weren't aware of the inherent distributability of git repositories, and presumably they're developers. You really think some aging career dev ops type who has worked at Microsoft for 30 years is going to make sensible policies about software that was shunned and forbidden only a decade ago?
I mean, it works fine for a few days or weeks, but then it gets corrupted. Doesn't matter if you use Dropbox, Google Drive, OneDrive, whatever.
It's apparently something to do with the many hundreds of file operations git does in a basic operation, and somehow none of the sync implementations can quite handle it all 100.0000% correctly. I'm personally mystified as to why not, but can attest from personal experience (as many people can) that it will get corrupted. I've heard theories that somehow the file operations get applied out of order somewhere in the pipeline.
Among other reasons, the sync engines of these cloud stores all have "conflict detection" algorithms when multiple machines touch the same files, and while a bare repo avoids conflicts in the worktree by not having a worktree, there are still a lot of files that get touched in every git push: the refs, some of the pack files, etc.
When it is a file conflict the sync engines will often drop multiple copies with names like "file (1)" and "file (2)" and so forth. It's sometimes possible to surgically fix a git repo in that state by figuring out which files need to be "file" or "file (1)" or "file (2)" or whatever, but it is not fun.
In theory, a loose objects-only bare repo with `git gc` disabled is more append-only and might be useful in file sync engines like that, but in practice a loose-objects-only bare repo with no `git gc` is not a great experience and certainly not recommended. It's probably better to use something like `git bundle` files in a sync engine context to avoid conflicts. I wonder if anyone has built a useful automation for that.
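In case anyone wants to experiment, a rough sketch of the bundle approach (the paths and naming scheme are made up): each push drops one new file into the synced folder, so the sync engine never has to reconcile concurrent edits to the same file.

    # writer side: snapshot all refs into a single bundle file
    git bundle create ~/Dropbox/backups/project-$(date +%Y%m%d-%H%M%S).bundle --all

    # reader side: a bundle can be cloned or fetched from like any other remote
    git clone ~/Dropbox/backups/project-20250101-120000.bundle project
    git fetch ~/Dropbox/backups/project-20250101-120000.bundle 'refs/heads/*:refs/remotes/backup/*'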
With bare repos? I was bit by this a few years ago when work switched to "everything on OneDrive" and it seemed fine until one day it wasn't. Following that I did tests with Dropbox and iCloud to find that all could corrupt a regular repo very easily. In the past few months though I've been trying it again with bare repos on iCloud and not had an issue... yet.
I haven't tried it, but I think it's fine if only one person has write access to any given clone. You can pull back and forth between clones freely. It's if you have two Git clients trying to write to the same repo that you'll have problems.
I've put private working copies on NFS and CIFS. NFS worked pretty well (which probably speaks as much to the admins as the tech). Samba mounts on the other hand had all sorts of problems with time stamps that confused not only git, but the build system as well.
Shared write access to the same git repo directory can be done sanely, but you have to get a number of things right (same group for all users, everything group writable, sticky bit on directories, set config core.sharedRepository=group): https://stackoverflow.com/a/29646155
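Roughly what that answer boils down to, assuming a shared `developers` group and a made-up server-side path:

    # on the server, as a member of the developers group
    mkdir -p /srv/git/project.git && cd /srv/git/project.git
    git init --bare --shared=group       # sets core.sharedRepository=group
    chgrp -R developers .                # shared group owns everything
    chmod -R g+w .                       # group writable
    find . -type d -exec chmod g+s {} +  # new files inherit the group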
Yes, when you're not on NFS. Maybe it works on NFS but I wouldn't bet my project on it. Locking reliably on NFS is easy to get wrong, and it's a comparatively little-tested scenario now compared to 30 years ago. (You'll notice that the question doesn't even mention the possibility of NFS.) Fortunately at least with Git it's easy to have lots of backups.
That makes me a little less suspicious, and of course the Git developers are well aware of things like concurrency problems and filesystem limitations. But I'd still regard it as an area to be wary of. But only if two clients might be writing to the same repo concurrently.
I read an article not long ago where students coming out of a web dev bootcamp were unable to make a hello world html file and open it in their browser.
We've gone so far with elaborate environments and setups to make it easy to learn more advanced things that many people never learn the very basics. I see this as a real problem.
If somebody asked me if it's possible to scp my git repo over to another box and use it there or vice versa, I would have said, yes, that is possible. Although I would've felt uneasy doing that.
If somebody had asked me whether git clone ssh:// ... would definitely work, I wouldn't have known out of the gate, although I would have thought it would be neat if it did, and maybe it does. I might have assumed there must be some sort of git server process running to handle it, although it's plausible it could all be handled by a script invoked from the client side.
And finally, I would've never thought really to necessarily try it out like that, since I've always been using Github, Bitbucket, etc. I have thought of those as permanent, while any boxes I have could be temporary, so not a place where I'd want to store something as important to be under source control.
You’ve always used GitHub but never known it could work over ssh? Isn’t it the default method of cloning when you’re signed in and working on your own repository…?
I have used SSH for GitHub of course, but the thought that I could also use it between any two random machines never occurred to me. And when it might have occurred to me, I would have assumed that SSH was only a mechanism for authentication, and that it would still require some further specialized server speaking protocols unknown to me. I always thought of SSH or HTTPS as means of authentication and of talking to the git server, rather than as the thing that handles cloning itself.
E.g. maybe the host would have to have something like apt install git-server installed there for it to work. Maybe it wouldn't be available by default.
I do know however that all info required for git in general is available in the directory itself.
Yes, SSH is used as a mechanism for authentication, and it still requires some further specialized server due to some protocols you don't know about. The server is git-upload-pack or git-receive-pack, depending on whether you're pulling or pushing. Your inference that a Linux distribution could conceivably put these in a separate package does seem reasonable, since for example git-gui is a separate package in Debian. I don't know of any distros that do in fact package Git that way, but they certainly could.
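To make that concrete, a rough picture of what happens under the hood (hostname and path are made up), which is why a plain ssh account with git installed is all the "server" you need:

    # this...
    git clone ssh://dev@build-box/home/dev/project.git

    # ...makes git run something roughly like the following behind the scenes,
    # then speak the pack protocol over that command's stdin/stdout:
    ssh dev@build-box "git-upload-pack '/home/dev/project.git'"

    # pushing is the same idea with git-receive-pack on the far end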
IT support and cybersecurity teams responsible for blocking and enforcing network access restriction to "github.com" ... blocked a user request to install "git" locally citing the former policy. The organization in question does IT services, software development, and maintenance as their main business.
There's an interview with Torvalds where he says his daughter told him that in the computer lab at her college he is better known for Git than for Linux.
Even someone who knows that git isn't GitHub might not be aware that ssh is enough to use git remotely. That's actually the case for me! I'm a HUGE fan of git, I mildly dislike GitHub, and I never knew that ssh was enough to push to a remote repo. Like, how does it even work, I don't need a server? I suspect this is due to my poor understanding of ssh, not my poor understand of git.
Git is distributed, meaning every copy is self-contained and does not depend on any other copy. Adding a remote to a repo is mostly just giving a name to a URL (URI?) for the fetch, pull, and push operations, which exchange commits. As commits are immutable and form a chain, it's easy to tell when two nodes have diverged, and conflict resolution can take place.
From the git-fetch(1) manual page:
> Git supports ssh, git, http, and https protocols (in addition, ftp and ftps can be used for fetching, but this is inefficient and deprecated; do not use them).
You only need access to the other node repo information. There's no server. You can also use a simple path and store the other repo on drive.
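Concretely, a small sketch (the names and paths are placeholders):

    # a remote is just a named URL or path; these all behave the same way
    git remote add laptop ssh://me@laptop.local/home/me/project
    git remote add backup /mnt/usb-drive/project.git
    git fetch laptop
    git fetch backup
    git merge laptop/main                # assuming that branch exists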
Doable. It would basically be ssh but without encryption. You'd have to bang out the login by hand, but "username\npassword\n" will probably work, maybe with a sleep in between, and of course you'll have to detect successful login too. Oh, and every 0xff byte will have to be escaped with another 0xff.
At that point, may as well support raw serial too.
Supporting rlogin on the other hand is probably as simple as GIT_SSH=rlogin
The server is git itself: it contains two commands, git-receive-pack and git-upload-pack, which it starts through ssh and communicates with over stdin/stdout.
You can think of the ssh://server/folder as a normal /folder. ssh provides auth and encryption to a remote hosted folder but you can forget about it for the purpose of understanding the nature of git model.
SSH is just a transport to get access to the repo information. The particular implementation does not matter (I think). Its configuration is orthogonal to git.
You do; an SSH server needs to be running on the remote if you want to ssh into it using your ssh client (the `ssh` command on your laptop). It's just not an http server, is all.
You start that server using the `sshd` [systemd] service. On VPSs it's enabled by default.
Git supports both http and ssh as the "transport method". So, you can use either. Browsers OTOH only support http.
Edit: hey this is really exciting. For a long time one of the reasons I've loved git (not GitHub) is the elegance of being a piece of software which is decentralized and actually works well. But I'd never actually used the decentralized aspect of it, I've always had a local repo and then defaulted to use GitHub, bitbucket or whatever instead, because I always thought I'd need to install some "git daemon" in order to achieve this and I couldn't be bothered. But now, this is so much more powerful. Linus Torvalds best programmer alive, change my mind.
BTW, a nice example of this general concept is Emacs' TRAMP mode. This is a mode where you can open and manipulate files (and other things) on remote systems simply by typing a remote path in Emacs. Emacs will then simply run ssh/scp to expose or modify the contents of those files, and of course to run any required commands, such as deleting a file.
I mean, you do need an ssh server. Basically ssh can run commands on the remote machine. Most commonly the command would be a shell, but it can also be git commands.
I always found these sort of categorical denigrations to be off base. If most people do think git = github then that's because they were taught it by somebody. A lot of somebodies for "most people". Likely by the same somebodies who also come to places like this. It has always taken a village to raise a child and that is just as true for an infant as a junior programmer. But instead we live in a sad world of "why didn't schools teach person x"
Despite your indignation, the observation that most people think GitHub is git is entirely true. Do you point it out when you spot someone having that mistaken belief? I do.
I don't think it has anything to do with being old; this is what happens when your software gets so complex that nobody even knows the very basics. This is highlighted by the documentation: if the software's documentation becomes lengthy enough to fill an entire book, you're going to get these sorts of articles from time to time. Another highlight is bash, with its huge man page for what should just be a shell.
I didn’t know either - or rather, I had never stopped to consider what a server needs to do to expose a git repo.
But more importantly, I’m not sure why I would want to deploy something by pushing changes to the server. In my mental model the repo contains the SOT, and whatever’s running on the server is ephemeral, so I don’t want to mix those two things.
I guess it’s more comfortable than scp-ing individual files for a hotfix, but how does this beat pushing to the SOT, sshing into the server and pulling changes from there?
There's a lot of configuration possible due to the fact that git is decentralized. I have a copy on my computer, which is where I do work. Another is on a VPS for backup. Then there's one on the app server which only tracks the `prod` branch. The latter is actually bare, but there's a worktree for the app itself. The worktree is updated via a post-receive hook, and I deploy changes via a simple `git push server prod`.
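For anyone wanting to copy this, here's a hedged sketch of a simplified variant (it uses a plain checkout directory driven by the hook rather than a proper git worktree; all paths, names, and the restart command are made up):

    # on the app server: a bare repo plus a directory the hook checks out into
    git init --bare /srv/app.git
    mkdir -p /srv/app

    # /srv/app.git/hooks/post-receive (don't forget chmod +x)
    #!/bin/sh
    GIT_WORK_TREE=/srv/app git --git-dir=/srv/app.git checkout -f prod
    systemctl restart myapp              # or however your app gets reloaded

    # on the dev machine
    git remote add server ssh://deploy@app-server/srv/app.git
    git push server prod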
You actually led me into a dive to learn what worktrees are, how bare repos + worktrees behave differently from a regular clone and how a repo behaves when it’s at the receiving end of a push, so thanks for that!
I’ve never worked with decentralized repos, patches and the like. I think it’s a good moment to grab a book and relearn git beyond shallow usage - and I suspect its interface is a bit too leaky to grok it without understanding the way it works under the hood.
Yep, me. I noticed that you sometimes use ssh:// URLs for GitHub, but I figured it was for authentication purposes only, and that once that step was over, some other thing came into play.
What are some fun/creative ways to do GitHub/GitLab style CI/CD with this method? Some kind of entry point script on push that determines what to do next? How could you decide some kind of variables like what the push was for?
Check the docs for the post-receive hook, it does give everything you need. I don't know what you have in mind by "GitHub/Gitlab style", but it's just a shell script, and you can add in as much yaml as you want to feel good about it.
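A minimal, hand-rolled example of that (script paths and branch names are purely illustrative): the post-receive hook receives one "<old-sha> <new-sha> <refname>" line per updated ref on stdin, which is enough to decide what to run.

    #!/bin/sh
    # hooks/post-receive on the receiving repo
    while read old new ref; do
        case "$ref" in
            refs/heads/main)    ./ci/run-tests.sh "$new" ;;
            refs/heads/release) ./ci/deploy.sh "$new" ;;
            refs/tags/*)        echo "tag pushed: $ref" ;;
        esac
    done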
I actually realized this last week, and have yet to try it. Programming for almost fifty years, using Git for thirteen years, and not an idiot (although there are those who would dispute this, including at times my spouse).
I have encountered it in real life many times. These days I try to give juniors extra space to expose their gaps in things I previously assumed were baseline fundamentals: directories and files, tgz/zip files, etc.
I never thought about it. If somebody had asked me, yeah, of course it makes sense. But it's just one of those things where I hadn't considered the possibility.
I imagine the larger part of the developer community does not, in fact, know that GitHub is not git, and that one can get everything they need without feeding their code to Microsoft's AI empire. Just another "Embrace, extend, and extinguish".
It's not "so easy to use without understanding it", it's the opposite it has so much unnecessary complexity (on top of a brilliant simple idea btw), that once people learn how to do what they need, they stop trying to learn any more from the pile of weirdness that is git.
Decades from now, git will be looked back at in a similar but worse version of the way SQL often is -- a terrible interface over a wonderful idea.
I don't think that's true. In my experience it takes time, but once it clicks, it clicks. Sure, there is a bunch of weirdness in there as well, but that starts way further than where your typical git user stops.
I don't think git would have ended up this popular if it couldn't be used in a basic way by just memorizing a few commands, without having to understand its repository model (however simple) well.
I was aware that it should be possible to interact directly with another repository on another machine (or, heck, in another directory on the same machine), which implies that ssh access is sufficient, but I was unaware of how that would be done.
There's definitely been generational loss around the capabilities of git. In the beginning, git was a new version-control option for people to explore; now git is the thing you learn in order to collaborate on github/lab/ea. I remember the Chacon book coming out as the peak between dense technical explainers and paint-by-numbers how-tos.