You would compose a series of local commits and, if you wanted, test them individually before pushing them. With a bunch of changes made to your files locally, you'd use tools like `git add -i` or `git add -p` to stage subsets of your changes to make those commits. As you finish building these commits, you would be left with a series of commits to your local branch, and no additional unstaged or uncommitted changes. You're "draining" the uncommitted changes into commits, part by part. Commands that manipulate the index are how you describe what you want to do to Git.
"Index" is probably not a helpful term. I think of them as simply "staged changes", that is, changes that will be committed to the repository when I run `git commit`, as distinct from local changes that will not be committed when I run `git commit`. With a Git repository checked out, just editing a file locally will not cause it to be included in a commit made with `git commit`. Rather, `git add` is how you describe that you want a change to be included in the "staged changes" that will be committed. You can add some files and not others, or even parts of a file and not other parts.
The need for this doesn't come up especially often, but it's really helpful when it does. One common case where this can come up is when you've been developing for a while locally, and you realize that your changes pertain to two different logical tasks or commits, and you want to break them up. Maybe one commit is "upgrade these dependencies" and the other is "add feature X". You started upgrading dependencies while building feature X, but the changes are logically unrelated, and now the dependency change is bigger than you expected and deserves to be reviewed on its own.
So with all of these changes in your workspace, you'll stage just the changes for "feature X" or "upgrade dependencies" and then run `git commit`. At this point, maybe you'll move this commit into its own branch or pull request in order to code review and ship it separately. (You might use `git stash` to save the remaining uncommitted changes while you do this.) Then you'll return to the remaining changes, which you will stage and commit as well. You've just gone from a bunch of unstructured, conflated changes to two or more separate commits on different branches/PRs (if that's what you want), that can be reviewed and shipped independently. You've gone from a massive change that's too big to review, to multiple bite-sized pieces.
These tools are also especially helpful if you, for any reason, need to manipulate source control history, such as breaking up one already-made commit into several commits, or simply modifying an existing commit. To do this, you would take that commit, apply it to the local workspace as if it's an unstaged change, and then, starting from the point in history before that commit was made, stage parts of the changes again and check them in. At this point, you can push the changes as a new branch, or even rewrite history by replacing your existing branch.
To give a use-case for this last capability, imagine that a developer accidentally checks in sensitive data as part of a big commit. Before (or even after) shipping the change, you realize this, so you want to go back and edit that commit, to remove the part of the change that checked in data while leaving the rest of the changes. You would describe these manipulations with the index as described in the previous paragraph.
You seem to miss the point that instead of staging changes out of your working directory and then committing them, you could just commit those changes out of your working directory. The extra step of staging is not needed.
Everywhere you said stage you could say commit (or amend) and then not need the extra step of committing afterwards.
> You seem to miss the point that instead of staging changes out of your working directory and then committing them, you could just commit those changes out of your working directory. The extra step of staging is not needed.
How would you describe this? One massive `git commit` with a ton of parameters? I don't see how it could work.
How would you describe "commit the first hunk of fileA (but not the second), and the second hunk of fileB (but not the first), and all of file C?". How do you "just commit those changes"? I believe you are missing how to actually describe this on the command line or with an API.
The index is absolutely needed. It's what allows you to build up a commit through a series of small, mutating commands like `git add fileC`, `git add -p fileA`. The value of the index is that you can build up your pending commit incrementally, while displaying what you've got with `git status`, then adding to it or removing from it.
You can build a commit using multiple small "commit --amend --patch" commands. These use the index, but only in a fleeting, ephemeral way; changes go into the index and then immediately into a commit. They go into a new commit, or if you use --amend, into the existing top-most commit.
The index is a varnish onion.
Git has too many kinds of objects in its model which are all bags of files.
I do not require a staging area that is neither commit, nor tree.
Look at GIMP. In GIMP, some layer operations get staged: you get a temporary "floating layer". This gets commited with an "anchor" operation, IIRC. But it's the same kind of thing: a layer. It's not a "staging frame" or whatever, with its own toolbox menu of operations.
$ git commit --patch
... pick out changes: commit 1 ...
$ git commit --patch
... pick out changes: commit 2 ...
$ git commit --amend --patch
... pick out more changes into commit 2 ...
$ git commit --patch
... pick out changes: commit 3 ...
There, now we have three commits with different changes and were never aware of any "index"; it was just used temporarily within the commit operation.
Oops, the last two should have been one! --> git rebase -i HEAD~2, then squash them.
The index is too fragile. You could spend 20 minutes carving out very specific changes to stage into the index. And then you do something wrong and that staging work is suddenly gone. Because it's not a commit, it's not in the reflog.
You want changes in proper commit objects as early as possible, then massage with interactive rebase, and ship.
Suppose HEAD points to a huge commit we would like to break up. One simple way:
$ git reset HEAD^
Now the commit is gone and the changes are local. Then just do the above procedure: commit --patch, etc.
>Rather, `git add` is how you describe that you want a change to be included in the "staged changes" that will be committed. You can add some files and not others, or even parts of a file and not other parts.
That's largely outdated. For years now, git's commit command has been able to stage changes and squirrel them into the commit in (apparently to the user) one operation.
Only people who learned git ten years ago (and then stopped) still do "git add -p" and then a separate "git commit" instead of just "git commit --patch" and "git commit --amend --patch" which achieve the same thing.
"Index" is probably not a helpful term. I think of them as simply "staged changes", that is, changes that will be committed to the repository when I run `git commit`, as distinct from local changes that will not be committed when I run `git commit`. With a Git repository checked out, just editing a file locally will not cause it to be included in a commit made with `git commit`. Rather, `git add` is how you describe that you want a change to be included in the "staged changes" that will be committed. You can add some files and not others, or even parts of a file and not other parts.
The need for this doesn't come up especially often, but it's really helpful when it does. One common case where this can come up is when you've been developing for a while locally, and you realize that your changes pertain to two different logical tasks or commits, and you want to break them up. Maybe one commit is "upgrade these dependencies" and the other is "add feature X". You started upgrading dependencies while building feature X, but the changes are logically unrelated, and now the dependency change is bigger than you expected and deserves to be reviewed on its own.
So with all of these changes in your workspace, you'll stage just the changes for "feature X" or "upgrade dependencies" and then run `git commit`. At this point, maybe you'll move this commit into its own branch or pull request in order to code review and ship it separately. (You might use `git stash` to save the remaining uncommitted changes while you do this.) Then you'll return to the remaining changes, which you will stage and commit as well. You've just gone from a bunch of unstructured, conflated changes to two or more separate commits on different branches/PRs (if that's what you want), that can be reviewed and shipped independently. You've gone from a massive change that's too big to review, to multiple bite-sized pieces.
These tools are also especially helpful if you, for any reason, need to manipulate source control history, such as breaking up one already-made commit into several commits, or simply modifying an existing commit. To do this, you would take that commit, apply it to the local workspace as if it's an unstaged change, and then, starting from the point in history before that commit was made, stage parts of the changes again and check them in. At this point, you can push the changes as a new branch, or even rewrite history by replacing your existing branch.
To give a use-case for this last capability, imagine that a developer accidentally checks in sensitive data as part of a big commit. Before (or even after) shipping the change, you realize this, so you want to go back and edit that commit, to remove the part of the change that checked in data while leaving the rest of the changes. You would describe these manipulations with the index as described in the previous paragraph.