I am going to orient this answer as if the question was "what are the advantages of chef-solo" because that's the best way I know to cover the differences between the approaches.
My summary recommendation is in line with others: use a chef-server if you need to manage a dynamic, virtualized environment where you will be adding and removing nodes often. A chef server is also a good CMDB, if you need one. Use chef-solo if you have a less dynamic environment where the nodes won't change too often but the roles and recipes will. Size and complexity of your environment is more or less irrelevant. Both approaches scale very well.
If you deploy chef-solo, use a cronjob with rsync, `git pull`, or some other idempotent file-transfer mechanism to maintain a full copy of the chef repository on each node. The cronjob should be easily configurable to (a) not run at all and (b) run without syncing the local repository. Add a nodes/ directory to your chef repository with a JSON file for each node. Your cronjob can be as sophisticated as you wish in identifying the right node file (though I would recommend simply `$(hostname -s).json`). You may also want to create an Opscode account and configure a client with Hosted Chef, if for no other reason than to be able to use knife to download community cookbooks and create skeletons.
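A minimal sketch of such a cron wrapper might look like the following. The flag-file names, repository path, and solo.rb location are assumptions for illustration, not a standard Chef convention:

```shell
#!/bin/sh
# Hypothetical chef-solo cron wrapper: sync the repo, then converge the node.
chef_solo_run() {
    repo="${CHEF_REPO:-/var/chef/repo}"

    # (a) skip the run entirely while this flag file exists
    [ -e "$repo/.chef-disable" ] && return 0

    # (b) run, but skip the repository sync, while this flag file exists
    [ -e "$repo/.chef-no-sync" ] || (cd "$repo" && git pull --quiet)

    # pick this host's node file by short hostname and converge
    "${CHEF_SOLO:-chef-solo}" -c "$repo/solo.rb" \
        -j "$repo/nodes/$(hostname -s).json"
}
```

The two flag files give operators the kill switches described above: touching `.chef-disable` stops runs during an incident, while `.chef-no-sync` lets you test local changes without them being overwritten by the next sync.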
There are several advantages to this approach, besides the obvious "not having to administer a server". Your source control will be the final arbiter of all configuration changes, the repository will include all nodes and runlists, and each server being fully independent facilitates some convenient testing scenarios.
Chef-server introduces a hole: anyone can update a cookbook with `knife upload`, and you must close this hole yourself (for example, with a post-commit hook) or risk site changes being silently overwritten by someone who uploads an obsolete recipe from the outdated local repository on their laptop. This is less likely to happen with chef-solo, since all changes are synced to servers directly from the master repository. The issue comes down to discipline and the number of collaborators: for a solo developer or a very small team, uploading cookbooks via the API is not very risky; in a larger team it can be, unless you put good controls in place.
Additionally, with chef-solo you can store all your nodes' roles, custom attributes, and runlists as node.json files in your main chef repository. With chef-server, roles and runlists are modified on the fly using the API. With chef-solo, you can track this information in revision control. This is where the conflict between static and dynamic environments can be clearly seen. If your list of nodes (no matter how long it might be) doesn't change often, having this data in revision control is very useful. On the other hand, if you're frequently spawning new nodes and destroying old ones (never to see their hostname or fqdn again), keeping it all in revision control is just an unnecessary hassle, and having an API to make changes is very convenient. Chef-server also has a whole set of features geared towards managing dynamic cloud environments, like the name option on "knife bootstrap", which lets you replace fqdn as the default way to identify a node. But in a static environment those features are of limited value, especially compared to having the roles and runlists in revision control with everything else.
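For illustration, a hypothetical nodes/web1.json might look like this (the role, recipe, and attribute names are placeholders); the cronjob would hand it to chef-solo with `-j`:

```json
{
  "run_list": ["role[webserver]", "recipe[ourcompany_customsoftware]"],
  "ourcompany": { "environment": "production" }
}
```

Because the file lives in the chef repository, every change to a node's runlist or attributes goes through revision control like any other change.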
Finally, recipe test environments can be set up on the fly for almost no extra work. You can disable the cronjobs on a server and make changes directly to its local repository, then test them by running chef-solo and see exactly how the server will configure itself in production. Once everything is tested, you can check in the changes and re-enable the local cronjobs. When writing recipes, though, you won't be able to use the Search API, so if you want to write dynamic recipes (e.g., load balancers) you will have to work around this limitation by gathering the data from the JSON files in your nodes/ directory. That is likely to be less convenient and will lack some of the data available in the full CMDB. Once again, more dynamic environments favor the database-driven approach, while less dynamic environments will be fine with JSON files on local disk. With a chef-server, where every run must make API calls to a central database, you are also dependent on that database for managing all your testing environments.
This testing technique is also useful in emergencies. If you are troubleshooting a critical issue on production servers and solve it with a configuration change, you can make the change immediately in the server's local repository and then push it upstream to the master.
Those are the primary advantages of chef-solo. There are some others, like not having to administer a server or pay for hosted chef, but those are relatively minor concerns.
To sum up: If you are dynamic and highly virtualized, chef-server provides a number of great features (covered elsewhere) and most of the chef-solo advantages will be less noticeable. However there are some definite, often unmentioned advantages to chef-solo especially in more traditional environments. Note that being deployed on the cloud doesn't necessarily mean you have a dynamic environment. If you can't, for example, add more nodes to your system without releasing a new version of your software, you probably aren't dynamic. Finally, from a high-level perspective a CMDB can be useful for any number of things only tangentially related to system administration and configuration such as accounting and information-sharing between teams. Using chef-server might be worth it for that feature alone.
Edit: This question and answer are years old. The definitive best practices are taught via the Learn Chef Rally self-paced training modules produced by Chef Software, Inc. The bulk of the original answer is below.
In this answer, "Chef" or "chef-client" usually refers to Chef Infra, the product. Opscode was renamed Chef Software, Inc. in 2013. In April 2019, Chef open-sourced all of its products and adopted consistent brand naming.
Not clear if it's better to set up roles in the Ruby DSL, JSON, or from the management console? Why are there multiple ways to do the same thing?
2019 Update: Policyfiles are the best workflow to use. Roles are considered an inferior practice, and Chef Software, Inc. recommends migrating to Policyfiles.
There are multiple ways to do the same thing because people have different workflows. You pick the workflow that is best for your environment. Let me explain what the differences are so you can make an informed decision.
The Ruby DSL for Roles exists to make it easier to write roles without knowing the syntax of JSON. It is a simple way to get started with Roles. Once you have made changes, you upload them to the Chef Server with knife.
knife role from file myrole.rb
This converts the role to JSON and stores it on the server. If you have an environment that enforces the Chef Repository where your roles live as the source of truth, this works quite well.
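For illustration, a minimal roles/myrole.rb might look like the following. The run list and attributes are placeholders; the file is evaluated by knife, not run directly:

```ruby
# roles/myrole.rb -- Ruby DSL role file (contents are illustrative)
name "myrole"
description "Example web server role"
run_list "recipe[apache2]", "recipe[myapp]"
default_attributes "apache" => { "listen_ports" => ["80"] }
```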
JSON is what the Chef Server stores, so you can also edit JSON directly in the management console. The JSON form does require more fields than the Ruby DSL in order for knife to recognize it properly for upload; those details are hidden to some degree by the web UI.
The disadvantage of using the web UI/management console for editing roles is that the changes aren't in your local version control system unless you download them from the server. You can do this with knife:
knife role show myrole -Fj
The `-Fj` flag tells knife to display the output in JSON format. You can redirect the output to a .json file if you like.
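For comparison, the JSON form of a role such as myrole carries extra fields (`json_class` and `chef_type`) so the tools recognize it as a role; the run list and attribute contents here are placeholders:

```json
{
  "name": "myrole",
  "description": "Example web server role",
  "json_class": "Chef::Role",
  "chef_type": "role",
  "run_list": ["recipe[apache2]", "recipe[myapp]"],
  "default_attributes": { "apache": { "listen_ports": ["80"] } },
  "override_attributes": {}
}
```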
Update from a few years later: There are additional knife commands for working with the files in the local chef repository. Currently these commands support only JSON-format files; a community RFC is open to add support for the Ruby DSL to these plugins. Here's a brief summary of the workflow.
Check content differences between the server and the local file.
knife diff roles/myrole.json
Upload a JSON formatted role file. The `roles/` path prefix is required; it maps to the corresponding API endpoint on the server.
knife upload roles/myrole.json
Download the content from the server, overwriting the file in the local repository.
knife download roles/myrole.json
These commands come from knife-essentials, which is built into the chef-client package.
Can you organize cookbooks into subdirectories? E.g., we have custom software that I'd like to write a cookbook for and stick into chef-repo/cookbooks/ourcompanystuff/customsoftwarecookbook. Would this be a good practice?
No. Knife has an expectation of where cookbooks should live because it uses an API to upload cookbooks to the Server. This is set in the `knife.rb` with `cookbook_path`. In older versions of Chef Infra, you could specify an array of paths for cookbooks, but this is being deprecated because it required more maintenance and was confusing to users.
By convention, we name customer- or site-specific cookbooks by prefixing the cookbook directory name. For your example, it would be:
chef-repo/cookbooks/ourcompany_customsoftware
There might be multiple different cookbooks for "ourcompany" depending on what you're doing.
Do I create a cookbook for each type of role that specifies what it does? Do I have these cookbooks include other cookbooks (i.e.- the cookbook for my webserver role includes the apache cookbook). I'm not sure how cookbook inter-dependencies and inheritance are handled.
There is no direct relationship or dependency between roles and cookbooks.
Roles have a run list, which specifies the recipes and other roles that should be applied to any node that has that role. Nodes have a run list that can contain roles or recipes. When Chef runs on the node, it will expand the run list for all the roles and recipes it includes, and then download the cookbooks required. In a node run list:
recipe[apache2]
Chef will download the `apache2` cookbook for the node so it can apply this recipe.
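The expansion step can be sketched in a few lines of Ruby. This is a simplification for illustration; real Chef also handles cookbook versions, environments, and cyclic role references:

```ruby
# Sketch: recursively expand a run list into a flat list of recipes.
# `roles` is a plain hash of role name => run list (illustrative data shape).
def expand_run_list(run_list, roles)
  run_list.flat_map do |item|
    if item =~ /\Arole\[(.+)\]\z/
      expand_run_list(roles.fetch(Regexp.last_match(1), []), roles)
    else
      [item]
    end
  end.uniq
end
```

So a node run list of `role[webserver]` flattens into the recipes of that role plus those of any roles it nests.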
You might have a cookbook specific for a role in your infrastructure. More commonly you'll have cookbooks that are for setting up certain types of services like apache2, mysql, redis, haproxy, etc. Then you would put those into appropriate roles. If you have custom application specific things that need to happen to fulfill a role, then you could write this into a custom cookbook (like I referenced above).
Is there anything like puppets external node classifier so nodes automatically determine their roles?
"Yes." The Chef Infra Server stores node data (as JSON) automatically, and it also automatically indexes all the node data for search.
It seems like you can configure things with knife or within the management console, or editing JSON files? This is super confusing to me why there are so many ways to do things, it's paralyzing! Is there a reason to use one or the other?
The Chef Infra Server has a RESTful API that sends and receives JSON responses. Knife and the management console are user interfaces for interacting with the API from an administration point of view.
You can use the tool you like better, though the management console doesn't have as many features as Knife. Most people who use Chef Infra prefer the command-line interface for the power and flexibility it provides, even folks who are using Chef Infra on Windows. Further, `knife` is a plugin-based tool; you can create new plugins to interact with the Chef Infra Server or with other parts of your infrastructure.
Chef Infra is a set of libraries, primitives, and an API. It gives you the flexibility to build the configuration management system that works best for your infrastructure.
How can I automatically provision nodes with chef in my dev cluster? With puppet I fire up a VM that connects to the puppetmaster and kicks off a puppet run and sets itself up (role is determined by external node classifier). How do I do this with chef? Install chef with pem/rb files that tie it to a chef server, manually tell the node its roles with knife or by editing this in the management interface, and then kick off a chef-client run to set itself up?
You'll want to use the knife bootstrap plugin. This is a built in plugin that comes with knife. You invoke it like this:
knife bootstrap 10.1.1.112 -x root -i ~/.ssh/root_id_rsa -r 'role[webserver]'
This will:
- SSH to the target system (10.1.1.112) as the `root` user using an SSH key (you could SSH as another user and then use `--sudo`).
- Install Ruby
- Install Chef
- Create the Chef configuration file for your Chef Server, reading knife's configuration (.chef/knife.rb).
- Copy the "validation" RSA private key, which the node will use to automatically register with the Chef Server.
- Run `chef-client` with the comma-separated run list specified. In this example only the `webserver` role is applied.
This assumes that the target system has been provisioned, has an IP address and you can SSH as root. Depending on your local policies and provisioning process, you may need to adjust how this works. The knife bootstrap page on the wiki describes more about how this works.
Knife also has plugins for a number of public cloud computing providers such as Amazon EC2 and Rackspace Cloud. There are plugins available for private cloud environments like Eucalyptus and OpenStack, as well as plugins for VMware vSphere and others. You can find further information in the documentation.
Are there any other good chef resources I might be missing in my searches?
The Chef Documentation is the primary source of documentation.
The Learn Chef Rally is a series of self-guided modules through which you can learn about various aspects of Chef Infra and other Chef products.
I used to maintain a blog where I posted tips, tricks, and guides about Chef Infra: http://jtimberman.housepub.org/. I had a series called "quick tips". Due to real life circumstances and other commitments, I no longer have time to maintain the site, but I may return to it in the future.
Chef customers can get help and support on the support site. The Chef user community is an excellent source of additional help.
Additional resources are available on Chef Software, Inc.'s web site.
I hope this helps.
The `apache2` cookbook published by Opscode can do this. See the `web_app` and `apache_site` definitions; their usage is documented in the README.md file (displayed by default on the link above).
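As a rough sketch of how `web_app` is typically invoked in a recipe (the site name, aliases, and docroot here are placeholders; check the cookbook's README for the exact parameters your version supports):

```ruby
# Hypothetical recipe snippet using the apache2 cookbook's web_app definition.
web_app "my_site" do
  server_name node['hostname']
  server_aliases [node['fqdn'], "my.example.com"]   # placeholder alias
  docroot "/srv/www/my_site"                        # placeholder docroot
end
```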