Manage and deploy Drupal code securely with Git, gitosis and Capistrano
UPDATE 2: I have written a follow-up article which simply covers the branching technique described in this post.
UPDATE: I am now using git-http-backend instead of Gitosis. I made the switch because I needed a more central location for managing our projects. I have not written about it yet, but this new approach allows me to authenticate via an LDAP/Active Directory type service.
I have quite a few sites set up with Drupal and it has been working beautifully for me. However, with the fast turnover of Drupal code, I was having trouble keeping all of the sites up to date with the most current code releases. I needed to figure out how to manage my Drupal code in a way that allowed me to upgrade easily without stressing about what I might break. After a lot of research and trial and error, I have finally settled on the following setup.
The main motivation for using Git is the painless branching and merging. I am using Capistrano because it is easy to set up and work with.
This post is meant to be a tutorial of sorts, but is more likely to be used as a reference when setting up or changing your sites.
What we will do (or need to do):
- Install Git on your local machine
- Install Git on your repository machine
- Install Git on your staging machine (optional)
- Install Git on your production machine
- Install gitosis on your repository machine
- Set up gitosis
- Set up your sites in Git and push to the repository machine
- Install and set up Capistrano
- Deploy your site to a staging machine (optional)
- Deploy your site to a production machine
Get the machines ready to do some real work...
Install Git on your local machine, repository machine and production machine (and the other machines as needed). This is easy enough, so I will not go into detail. If you have problems, you can check out these articles because I have gone into installation details there.
Install gitosis on your Repository machine
cd ~/src
git clone git://eagain.net/gitosis.git
cd gitosis
python setup.py install
If you get errors on the above line of code, then you will need to install python-setuptools.
Now set up a user on your Repository machine that will be the one who owns the repositories. I use 'git'.
sudo adduser \
--system \
--shell /bin/sh \
--gecos 'git version control' \
--group \
--disabled-password \
--home /home/git \
git
Now you need to have an SSH Key set up on your local machine. If you don't have one, do this (on your local machine)...
ssh-keygen -t rsa
Assuming that you created the file id_rsa.pub, now you need to copy it to your Repository machine. I put mine in /tmp/id_rsa.pub so it is easy to access.
Now initialize gitosis with that key (on the Repository machine).
sudo -H -u git gitosis-init < /tmp/id_rsa.pub
Make sure that post-update is set to executable.
sudo chmod 755 /home/git/repositories/gitosis-admin.git/hooks/post-update
Now set up the gitosis admin repository on your local machine.
(Everything from here on is on your local machine...)
Note: cd into the directory you want your code to live first.
git clone git@YOUR_REPOSITORY_MACHINE:gitosis-admin.git
cd gitosis-admin
Set up the admin for the repositories.
Open the file gitosis.conf and make sure it has the following for the admin setup.
[gitosis]
[group gitosis-admin]
writable = gitosis-admin
members = youruser@machine
(youruser@machine is taken from your id_rsa.pub file. It is the last piece of that file and will look something like: hzvu4nTtw3Q== youruser@machine)
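As a quick check, the identifier can be read straight out of the public key file. This is a minimal sketch, assuming the usual three-field OpenSSH public key layout (type, key data, comment); the key material and the temporary file are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the comment field (e.g. youruser@machine) of a public key.
# The key below is a fake placeholder, not a usable key.
set -e

keyfile=$(mktemp)
echo "ssh-rsa AAAAB3NzaC1yc2EFAKEKEYDATAhzvu4nTtw3Q== youruser@machine" > "$keyfile"

# Fields are separated by spaces: type, base64 data, then the comment.
member=$(cut -d' ' -f3- "$keyfile")
echo "$member"

rm -f "$keyfile"
```

Point the cut command at ~/.ssh/id_rsa.pub to see the value for your own key.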
UPDATE: I have had some problems with this method of specifying a user. What I do now, which seems to work, is make sure that my .pub files have unique names when I create them (for example, forwardthinkingdesign_prod_root.pub), and I use that filename, without the .pub extension, in the 'members' section.
Setup and manage your repositories...
Also in gitosis.conf add your project (your_project in this example).
[group yourteam]
members = youruser@machine
writable = your_project
Push this updated configuration to the server.
(You will probably have to add the appropriate files to be committed with 'git add'. When you have more than one project, you will have more files here that will be untracked.)
git commit -a -m "Allow youruser@machine write access to your_project"
git push
Now you are going to actually create your_project and push it to the gitosis repository.
mkdir your_project
cd your_project
git init
git remote add origin git@REPOSITORY_MACHINE:your_project.git
At this point you need to decide how you are going to manage your Drupal code. You can put it all in one branch with git push origin master:refs/heads/master, but I would NOT recommend this.
The Git setup I recommend for managing Drupal projects.
Create a branch drupal just for the Drupal code. You never make changes to this branch other than upgrading to newer versions of Drupal. Git created a master branch when you did git init and since Git does not like you trying to branch from it without doing any 'real work', we are going to add to the master branch and then rename it to drupal before we add it to the repository machine.
tar xvzf drupal-6.x.tar.gz
rm -rf drupal-6.x.tar.gz
mv drupal-6.x drupal
I also add a .gitignore file to this base drupal branch so I do not have to worry about stuff I do not want getting into my repository.
Create a .gitignore file in the root of your repository and add the following lines to it.
.DS_Store
drupal/sites/default/files/
Capfile
config/deploy.rb
Explanation of ignores:
.DS_Store -> This is because I am on a Mac and it creates these files all over the place. They should not be in my repository.
drupal/sites/default/files/ -> This is because I run a local server on my machine that needs the 'files' directory, but I do not want to track the 'files' directory in my repo. (this may also be drupal/files/ depending on what version of drupal you are on.)
Capfile -> We have not gotten this far, but this is for Capistrano.
config/deploy.rb -> Again, this is for Capistrano.
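To see what these ignore rules do in practice, here is a small sketch that builds a throwaway repository in a temporary directory, drops in one file of each ignored kind plus one ordinary file, and lists what Git actually stages. The file names photo.jpg and index.php are stand-ins I made up for the demo.

```shell
#!/bin/sh
# Sketch: confirm that the .gitignore entries keep unwanted files out of the index.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

cat > .gitignore <<'EOF'
.DS_Store
drupal/sites/default/files/
Capfile
config/deploy.rb
EOF

# One ordinary file, plus one file matching each ignore rule.
mkdir -p drupal/sites/default/files config
echo "<?php" > drupal/index.php
touch .DS_Store drupal/sites/default/files/photo.jpg Capfile config/deploy.rb

git add .
tracked=$(git ls-files)
echo "$tracked"
```

Only .gitignore itself and drupal/index.php should be listed; everything else stays out of the repository.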
Now add and commit these changes.
git add .
git commit -a -m "Initial Drupal commit"
Rename the master branch to our drupal branch.
git branch -m master drupal
We now have the drupal branch up to date on our local machine. Let's push it to the repository server.
git push origin drupal:refs/heads/drupal
From the drupal branch we want to create a modules branch that will be used to manage all of the unmodified contributed modules. This should only be the module code that you get from drupal.org. You do not modify any of the code in this branch, just upgrade the modules when needed.
You need to make sure you are on the drupal branch and then you will create a modules branch. Once there you will add all your modules to the modules branch.
git checkout drupal
git checkout -b modules
cd drupal/sites/all
mkdir modules
cd modules
tar xzvf yourmodule.tar.gz
rm -rf yourmodule.tar.gz
... (repeat for all your modules) ...
git add .
git commit -a -m "Initial add of all of my sites modules"
(I also track my unmodified themes in this branch.)
When you are done adding all of your modules then you will want to push the modules branch to the repository machine.
git push origin modules:refs/heads/modules
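The "repeat for all your modules" step can be scripted with a loop. The sketch below is one way to do it, assuming all the module tarballs sit in the current directory; it first builds two fake tarballs (named views and cck purely for illustration) in a temporary directory so that it can run anywhere.

```shell
#!/bin/sh
# Sketch: unpack every module tarball in a directory, then delete the tarball.
set -e

modules_dir=$(mktemp -d)
cd "$modules_dir"

# Build two fake module tarballs to stand in for drupal.org downloads.
for name in views cck; do
  mkdir "$name"
  echo "name = $name" > "$name/$name.info"
  tar czf "$name.tar.gz" "$name"
  rm -r "$name"
done

# The actual technique: extract each archive and remove it afterwards.
for archive in *.tar.gz; do
  tar xzf "$archive"
  rm "$archive"
done

ls
```

In a real project you would run only the second loop from inside drupal/sites/all/modules and then do the git add and git commit as shown above.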
Now you should have all of the base modules (and themes) tracked in your repository. At this point you want a branch where you can do all your modifications that are custom to your site. I track all my custom site specific modules, custom site specific themes and all of my module and core hacks in this branch. I call this branch production since it is what I push out to my staging and production machines.
Let's create the production branch and add our site specific stuff (including drupal/sites/default/settings.php).
git checkout modules
git checkout -b production
# make all of your changes that are custom to the site.
git add .
git commit -a -m "Initial add of all of my site specific hacks/modules/themes/etc..."
Now let's push this branch to the repository machine.
git push origin production:refs/heads/production
Updating to a new version of Drupal (or updating modules)
This is where the real power of Git comes in and this is why I use this method.
When upgrading the Drupal core, we only want to change the drupal branch because that is where the Drupal core is stored.
Checkout the drupal branch and once there we are going to replace the current version of Drupal with a new version.
git checkout drupal
rm -rf drupal
tar xvzf drupal-6.xx.tar.gz
rm -rf drupal-6.xx.tar.gz
mv drupal-6.xx drupal
Now let's update the local repository and then push it to the repository machine.
git add .
git commit -a -m "Drupal 6.xx update"
git push origin drupal:refs/heads/drupal
We have to propagate this change up through our other branches, so we will pull these changes into the modules branch and then commit and push it to the repository machine.
git checkout modules
git pull . drupal
git push origin modules:refs/heads/modules
And likewise for the production branch.
git checkout production
git pull . modules
git push origin production:refs/heads/production
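If you want to convince yourself that the propagation works before touching a real site, the whole branch chain can be rehearsed in a throwaway repository. This is a sketch under some assumptions: tiny text files stand in for Drupal core, a contrib module and settings.php, there is no remote so the push steps are left out, and it uses git merge, which does the same job as git pull . branch for local branches.

```shell
#!/bin/sh
# Sketch: rehearse the drupal -> modules -> production upgrade flow locally.
set -e

work=$(mktemp -d)
cd "$work"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"

# drupal branch: pristine core (a one-line file stands in for it)
mkdir drupal
echo "core 6.1" > drupal/CHANGELOG.txt
git add .
git commit -q -m "Initial Drupal commit"
git branch -m drupal

# modules branch: unmodified contrib code on top of core
git checkout -q -b modules
mkdir -p drupal/sites/all/modules/views
echo "views 2.0" > drupal/sites/all/modules/views/views.info
git add .
git commit -q -m "Add contrib modules"

# production branch: site specific changes on top of modules
git checkout -q -b production
mkdir -p drupal/sites/default
echo "custom settings" > drupal/sites/default/settings.php
git add .
git commit -q -m "Site specific changes"

# Upgrade core on the drupal branch only
git checkout -q drupal
rm -rf drupal
mkdir drupal
echo "core 6.2" > drupal/CHANGELOG.txt
git add -A .
git commit -q -m "Drupal 6.2 update"

# Propagate the upgrade up the chain
git checkout -q modules
git merge -q -m "Merge drupal into modules" drupal
git checkout -q production
git merge -q -m "Merge modules into production" modules

cat drupal/CHANGELOG.txt
```

After the last merge, the production branch has the new core file alongside the module and the site specific settings, which is exactly the property this branch layout is meant to give you.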
Now all of your code is up to date on your local machine as well as your repository machine. You can update contributed modules the same way: download the newer module code, commit it on the modules branch, and then pull that into the production branch.
Deploying your code to the staging and production machines
At this point some people may choose to just copy and paste the code from their repository to their staging/production machines. As a simple solution, that is not too bad, but I want a more elegant method for deploying. For this I use Capistrano. Capistrano is built on Ruby and is very popular with Ruby on Rails developers. In order for it to work nicely with PHP and Drupal, we need to create our own config/deploy.rb file. Luckily for you, I have already done this and you can find it attached at the end of this article. Let's get started...
Set up your remote machines with permission to access your repository
In order for your staging and production machines to be allowed to access your repository machine to get the latest code, you need to set up an RSA key for each of them. Do the following on all the machines that need to access the repository (staging and production).
ssh YOUR_USER@REMOTE_MACHINE
ssh-keygen -t rsa -> Be sure to create it with a unique name (example: 'staging')
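Copy the staging.pub file from REMOTE_MACHINE to your local machine and place it in the gitosis-admin/keydir folder. Then edit the gitosis-admin/gitosis.conf file and add the user from your REMOTE_MACHINE to the appropriate section, as we did before. (If you name members after the key file, as described in the update above, the member is the filename without .pub, here staging.)
Example (remoteuser@remotemachine is the addition):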
[group yourteam]
members = youruser@machine remoteuser@remotemachine
writable = your_project
Now add, commit and push the changes to the repository machine (needs to be done from inside the gitosis-admin directory).
git add gitosis.conf keydir/staging.pub
git commit -a -m "Gave the staging machine the ability to access the repository"
git push
(Repeat for all the machines that need access to the repository)...
Giving additional users access to your repository
If you need to add someone to the project, set them up with access the same way you just did above for the 'staging' machine. Once they are set up with permissions, they will need to clone the project to have access.
git clone git@REPOSITORY_MACHINE:your_project.git
Notes (start)
If you are on Mac OS X Snow Leopard, it may ask you for a password. This is because Snow Leopard ships with key forwarding disabled by default and you will have to modify the file /etc/ssh_config to get it working.
Change the lines:
# Host *
# ForwardAgent no
To:
Host *
ForwardAgent yes
If it still asks you for a password on Snow Leopard, you may need to add your passphrases to the Apple keychain. Type the following in a terminal.
ssh-add -K ~/.ssh/name_of_your_key
If you still have problems, you may want to add loglevel = DEBUG under the [gitosis] section of your gitosis.conf file to get more information about what is happening.
Permission Denied
This often happens and is a PITA to figure out. One thing that I have found is that once you add a user to the remote repository, they can have problems connecting to the remote machine to clone the repository. This may resolve the issue (this has always been on a Mac).
Add the following to the file: ~/.ssh/config
Host REPOSITORY_MACHINE
User git
Hostname REPOSITORY_MACHINE
PreferredAuthentications publickey
IdentityFile /path/to/.ssh/filename
(filename is the same as the filename.pub file, but without the .pub at the end)
Notes (end)
Once it clones the project, you will probably get the following error:
Warning: Remote HEAD refers to nonexistent ref, unable to checkout.
This is basically saying "I don't know what branch to checkout".
Check and see what branches you have available:
cd your_project
git branch -a
This should show you:
origin/drupal
origin/modules
origin/production
You will not want to check those branches out directly because they are remote branches and you will detach the head if you do. Basically you want to create local branches that you can change that will be linked to the remote branches.
git checkout -b drupal origin/drupal
git checkout -b modules origin/modules
git checkout -b production origin/production
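The same situation can be reproduced end to end without a server. The sketch below is an illustration only: a local bare repository in a temporary directory plays the part of the repository machine, it has the three branches but no default branch, so the clone prints the warning and checks nothing out, and a loop then creates the local tracking branches. I pass --track explicitly so the result does not depend on local Git configuration.

```shell
#!/bin/sh
# Sketch: clone a repository that has no default branch, then create local
# branches that track the remote drupal, modules and production branches.
set -e

tmp=$(mktemp -d)
cd "$tmp"

# A bare repository standing in for the gitosis-hosted project
git init -q --bare origin.git

# Seed it with the three branches
git init -q seed
cd seed
git config user.email "demo@example.com"
git config user.name "Demo User"
echo "placeholder" > README.txt
git add .
git commit -q -m "Initial commit"
git branch -m drupal
git branch modules
git branch production
git push -q ../origin.git drupal modules production
cd ..

# Cloning warns that the remote HEAD refers to a nonexistent ref
git clone -q origin.git your_project
cd your_project

for branch in drupal modules production; do
  git checkout -q --track -b "$branch" "origin/$branch"
done

git branch
```

The final git branch should list drupal, modules and production, with production checked out.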
Now they will have a working local copy of the repository and they will be able to make commits to the remote repository.
Set up Capistrano and server configuration
You need to install Capistrano on your local machine. They have great documentation at the Capistrano website which I reference every time I do this, so I will leave that as an exercise for you to do.
At this point I am assuming that you have Capistrano installed on your local machine. We now need to set up our project to deploy with Capistrano.
In the root of your repository directory (eg: your_project) on your local machine, do this.
mkdir config
capify .
Download the file (deploy.rb) that is attached and replace the deploy.rb file in the config directory. You will have to go through the file and change all the sections that are IN_CAPS to the correct information. I will try to make this painless, but it may take some playing to get it set up for your environment. A lot of this depends on how you want to set up your server, so you may have to change the configuration as needed...
Now that your configuration is correct we are going to do the setup.
Note: You need to set up your staging and production machines to match what you specified in the deploy.rb file. So you need to create the directory that you want to deploy to (in my example it is /var/www/) and make sure it is owned by the user that you specified in your deploy.rb file.
Now we need to create the skeleton of your deploy file structure.
cap deploy:setup
Make sure that you have everything in place to do the deploy.
cap deploy:check
(If you already have 'files' for your project, you may want to copy them into the /var/www/your_project/shared/files/ directory at this point so the actual deploy command can change the permissions correctly.)
Let's do the deployment...
cap deploy
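For orientation, this is roughly the directory layout that Capistrano 2 style deployments use under the deploy path: each deploy lands in a timestamped directory under releases/, shared/ holds things that survive between deploys (such as the files directory mentioned above), and current is a symlink to the newest release. The sketch below only imitates that layout in a temporary directory. The timestamps are invented, and it neither runs Capistrano nor reflects anything specific to the attached deploy.rb.

```shell
#!/bin/sh
# Sketch: imitate the releases/shared/current layout and a symlink switch.
set -e

deploy_to=$(mktemp -d)
cd "$deploy_to"

# Skeleton, roughly what a setup step would prepare
mkdir -p releases shared/files

# First deploy: a timestamped release, with "current" pointing at it
mkdir releases/20100101000000
ln -s releases/20100101000000 current

# Second deploy: a new release, then repoint "current" in one step.
# -n stops ln from following the existing symlink into the old release.
mkdir releases/20100102000000
ln -sfn releases/20100102000000 current

readlink current
```

Because the web server only ever looks at current, switching releases (or rolling back) comes down to repointing one symlink.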
You have just deployed your code from your repository machine to your staging or production machine. You will have to make sure that your Apache configuration is set up correctly to serve the site that is currently deployed. To do that I have attached a sample vhosts file (sample_site.conf) that may get you moving in the right direction.
Congratulations!!! Now you can hack the core to your heart's content and you will still have a clean upgrade path (just do it in your production branch). ;)
References:
- Hosting Git repositories, The Easy (and Secure) Way
- Drupal Development and Deployment using Git
- Installing gitosis
- A Tempest of Thoughts - Capistrano
- Enable SSH Agent (Key) Forwarding on Snow Leopard
| Attachment | Size |
|---|---|
| deploy.rb | 3.16 KB |
| sample_site.conf | 864 bytes |


Comments
Damn
Ok first of all that took awhile to read, and secondly thank you for the timing of this post. I was just bitching to myself that Drupal updates stuff every 30 seconds it seems like, and the more sites I use it on the more pain in the butt it becomes to keep updated across multiple servers. So cheers, and thanks for the awesome timing of this post - as usual.
sorry, you must have a tty to run sudo
I was getting this error on a server and after some research, I found that I needed to add the following line near the top of my config/deploy.rb file...
default_run_options[:pty] = true
Once I did that, it worked fine...
Beautiful. But what about the database?
Can you shed some light on how you manage database changes? Or why you don't need to worry about them?
Gitosis on Repository machine, not local, right?
In the early section "What we will do" you say: "Install gitosis on your local machine". I hope you mean to install it on your repository machine. In fact, yes, now I see that later on you give instructions on how to install it onto the repository machine. You may want to fix that typo.
Thanks for the great article! It was incredibly helpful.
Shawn
Managing Databases
@Bronius: I have not done anything special for databases. I do not have my databases in the git repository because that would get complicated if I had to manage community generated information in git.
How I deal with the databases is as follows:
Thats basically it...
Great but lost with Capistrano!
Hi, Will. Your tutorial is easy to follow even for a person without experience of these apps like myself.
In my case, that held right up to the point after installing Capistrano. Changing the deploy.rb for my Ubuntu Lucid workstation and a site5 shared hosting account is beyond my scope!
Thank you so much!
advantage of drupal + modules branches
I'm trying to adopt a similar repository structure, but only utilize one branch for unmodified core and contrib.
What is the advantage to having core and contrib modules in separate branches?
re: advantage of drupal + modules branches
Geoff: The reason for having both drupal and modules branches is for the following situation.
If you have them together and you have a bunch of modules, then when you want to upgrade drupal and you replace the current drupal directory with what you just downloaded, you will lose all of your modules. By having a modules branch that pulls in the drupal branch, you can change the drupal branch and then just pull the changes into the modules branch.
Thank you very much for your
Thank you very much for your article. It is the most useful information about git workflow in Drupal development that I have found on the net. I've made some modifications to your method. My branches are: core, contrib, develop and master. The master branch is used only for final production releases. I suppose I will use hotfix and release branches too. This modification was inspired by http://nvie.com/posts/a-successful-git-branching-model/ The second modification is the deploy method. I am using a simple git deploy based on the post-update hook mentioned in http://joemaller.com/990/a-web-focused-git-workflow/ The last modification is the download method: I use "drush dl", not wget. I've found drush very useful and fast. I've tried submodules, nested repositories and git-subtree but they are too uncomfortable and have side effects.
Branching strategy... modules after configs?
Thank you for this very useful tutorial! I am setting up my own branching strategy and it occurs to me that module changes to my sites happen far more frequently than configuration changes. So it seems to me that it would be more logical to have the modules branch off of the configuration branch like this:
That would require less merging than the reverse. Or is there some other reason to make configuration branch off of modules? I suppose if you want to maintain multiple deployments (i.e. non-intersecting sets of site configs) in one repository, you could do it on several branches, but I think that could get confusing. Those multiple deployments probably won't share module installations anyway. I think for multiple deployments, I would use something like this:
Also, I tend to make patches to core drupal that are unique to some deployments. So I think I will put a core-patchesX branch off of each siteconfX branch, and module-patchesX branch off of each modulesX branch:
It seems like this is getting complicated, I wonder if I'm over-thinking things.
I agree that it's important
I agree that it's important that you pull the db down from the production site, but what happens if you have installed a new module as part of the development cycle? If you bin your local db and pull down your production db you will (at the very least) lose this module's configuration settings... or am I missing something clever...