Chef remote_directory permissions

Tags: chef, directory, permissions

I have already checked https://docs.chef.io/resource_remote_directory.html and http://joerussbowman.tumblr.com/post/58241535331/chef-remote-directory-is-basically.

chef-client version 12.2.1

OS CentOS 6.6

I'm working on deploying our Tomcat installation to a node. The recipe looks like this:

user 'tomcat' do
  comment 'Tomcat User generated by chef'
  uid 2004
  home '/opt/tomcat'
  shell '/bin/bash'
end

remote_directory '/opt/tomcat' do
  source 'tomcat-6.0.35'
  owner 'tomcat'
  group 'tomcat'
  mode '0755'
  files_owner 'tomcat'
  files_group 'tomcat'
  files_mode '0644'
end

remote_directory '/opt/tomcat/bin' do
  source 'bin'
  files_owner 'tomcat'
  files_group 'tomcat'
  files_mode '0755'
  owner 'tomcat'
  group 'tomcat'
  mode '0755'
end

For some reason this leaves some directories owned by root.root:

[~~~~~~~~~~~ tomcat]~ ll
total 88
drwxr-xr-x 4 root   root    4096 Apr  8 13:59 appconfig
drwxr-xr-x 2 tomcat tomcat  4096 Apr  8 13:59 bin
drwxr-xr-x 4 tomcat tomcat  4096 Apr  8 13:59 conf
drwxr-xr-x 2 tomcat tomcat  4096 Apr  8 13:59 lib
-rw-r--r-- 1 tomcat tomcat 37951 Apr  8 13:59 LICENSE
-rw-r--r-- 1 tomcat tomcat   558 Apr  8 13:59 NOTICE
-rw-r--r-- 1 tomcat tomcat  8680 Apr  8 13:59 RELEASE-NOTES
-rw-r--r-- 1 tomcat tomcat  6670 Apr  8 13:59 RUNNING.txt
drwxr-xr-x 3 root   root    4096 Apr  8 13:59 shared
drwxr-xr-x 7 root   root    4096 Apr  8 13:59 webapps

This odd behaviour continues throughout the tree: the same directories are always left as root.root instead of tomcat.tomcat.

The only additional recipes are a java one from the supermarket and a basic one to copy over mod_jk and install httpd.

So the question is: am I doing something silly, reading the documentation incorrectly, or is this a glitch?

If it's me, what am I doing wrong? Cheers.

Note: I have also tried adding the following, which still doesn't recurse correctly.

directory '/opt/tomcat' do
  owner 'tomcat'
  group 'tomcat'
  recursive true
end

Best Answer

First off, you're not losing your mind - this is actually the intended behavior for any directory-style resource when doing recursive directory creation.

The first level of the remote directory is being set correctly - it's the recursive ones that are not.

Docs reference:

The remote_directory resource can be used to recursively create the path outside of remote directory structures, but the permissions of those outside paths are not managed. This is because the recursive attribute only applies group, mode, and owner attribute values to the remote directory itself and any inner directories the resource copies.

Here's how I'd attempt to do this:

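# source is not set here, so by default each resource looks for a cookbook
# directory named after the last path component (appconfig, bin, shared, webapps).
%w(
  /opt/tomcat/appconfig
  /opt/tomcat/bin
  /opt/tomcat/shared
  /opt/tomcat/webapps
).each do |path|
  remote_directory path do
    files_owner 'tomcat'
    files_group 'tomcat'
    files_mode '0755'
    owner 'tomcat'
    group 'tomcat'
    mode '0755'
  end
end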

This will loop through all of the subdirectories in the list and set their permissions correctly.

There may be some tweaks to this block based on how the files are laid out in the cookbook structure, but the general message here is that you have to manage subdirectories.

Another approach would be to use a raw Ruby method to enforce permissions like so:

ruby_block 'set permissions for tomcat dir' do
  block do
    require 'fileutils'
    # chown_R walks the whole tree; plain chown would only change /opt/tomcat itself
    FileUtils.chown_R 'tomcat', 'tomcat', '/opt/tomcat'
  end
  action :run
end
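To check which paths are still owned by the wrong user after a chef-client run, a small audit script can help. This is a minimal sketch in plain Ruby, run outside Chef. The helper name paths_not_owned_by is hypothetical, and the usage comment assumes the tomcat user and /opt/tomcat exist on the node.

```ruby
require 'find'

# Hypothetical helper: return every path under root whose owner uid differs
# from expected_uid. lstat is used so symlinks are inspected, not followed.
def paths_not_owned_by(root, expected_uid)
  Find.find(root).reject { |path| File.lstat(path).uid == expected_uid }
end

# Example usage on the node (assumes the tomcat user exists):
#   require 'etc'
#   puts paths_not_owned_by('/opt/tomcat', Etc.getpwnam('tomcat').uid)
```

An empty result means every file and directory in the tree has the expected owner.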

Related Solutions

Chef best Practices/Questions

Edit: This question and answer are years old. The definitive best practices are taught via the Learn Chef Rally self-paced training modules produced by Chef Software, Inc. The bulk of the original answer is below.

In this answer, "Chef" or "chef-client" usually refers to Chef Infra, the product. Opscode was renamed Chef Software, Inc. in 2013. In April 2019, Chef opened the source code for all its products and introduced consistent brand naming.

Not clear if it's better to setup roles in ruby DSL, JSON, or from the management console? Why are there multiple ways to do the same thing?

2019 Update: Policyfiles are the best workflow to use. Roles are considered an inferior practice, and Chef Software, Inc. recommends migrating to Policyfiles.

There are multiple ways to do the same thing because people have different workflows. You pick the workflow that is best for your environment. Let me explain what the differences are so you can make an informed decision.

The Ruby DSL for Roles exists to make it easier to write roles without knowing the syntax of JSON. It is a simple way to get started with Roles. Once you have made changes, you upload them to the Chef Server with knife.

knife role from file myrole.rb

This converts the role to JSON and stores it on the server. If you have an environment that enforces the Chef Repository where your roles live as the source of truth, this works quite well.

JSON is what the Chef Server stores, so you also edit JSON directly in the management console. It does require more fields than the Ruby DSL in order for Knife to recognize it properly to upload. Those details are hidden to some degree via the web UI.
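To make the "more fields" point concrete, here is a rough sketch that builds a role as a Ruby hash and prints it as JSON. The role name, description, and run list are made-up examples, and the field set is what I would expect to see from knife role show -Fj; compare it against the output from your own server rather than treating it as definitive.

```ruby
require 'json'

# Hypothetical role data. json_class and chef_type are the extra bookkeeping
# fields that the Ruby DSL does not ask you to write by hand.
role = {
  'name' => 'myrole',
  'description' => 'Example web server role',
  'json_class' => 'Chef::Role',
  'chef_type' => 'role',
  'default_attributes' => {},
  'override_attributes' => {},
  'run_list' => ['recipe[apache2]'],
  'env_run_lists' => {}
}

role_json = JSON.pretty_generate(role)
puts role_json
```

The output could be saved as roles/myrole.json and kept in version control alongside the rest of the repository.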

The disadvantage of using the webui/management console for editing roles is they aren't in your local version control system unless you download them from the server. You can do this with knife:

knife role show myrole -Fj

The -Fj tells knife to "display in JSON format." You can redirect the output to a .json file if you like.

Years ago update: There are additional knife commands for working with the files in the local chef repository. Currently these commands support only JSON format files. A community RFC is open which will address adding support for the Ruby DSL for these plugins. Here's a brief summary of the workflow.

Check content differences between the server and the local file.

knife diff roles/myrole.json

Upload a JSON formatted role file. The roles/ path is required. This gets mapped to the same API endpoint on the server.

knife upload roles/myrole.json

Download the content from the server overwriting the content of the file in the repository.

knife download roles/myrole.json

These commands come from knife-essentials, which is built into the chef client package.

Can you organize cookbooks into subdirectories? eg- we have custom software that I'd like to write a cookbook for and stick that into: chef-repo/cookbooks/ourcompanystuff/customsoftwarecookbook would this be a good practice?

No. Knife has an expectation of where cookbooks should live because it uses an API to upload cookbooks to the Server. This is set in the knife.rb with cookbook_path. In older versions of Chef Infra, you could specify an array of paths for cookbooks, but this is being deprecated because it required more maintenance and was confusing to users.

By convention we name customer specific or site specific cookbooks with the name prefixed in the cookbook directory. For your example, it would be:

chef-repo/cookbooks/ourcompany_customsoftware

There might be multiple different cookbooks for "ourcompany" depending on what you're doing.

Do I create a cookbook for each type of role that specifies what it does? Do I have these cookbooks include other cookbooks (i.e.- the cookbook for my webserver role includes the apache cookbook). I'm not sure how cookbook inter-dependencies and inheritance are handled.

There is no direct relationship or dependency between roles and cookbooks.

Roles have a run list, which specifies the recipes and other roles that should be applied to any node that has that role. Nodes have a run list that can contain roles or recipes. When Chef runs on the node, it will expand the run list for all the roles and recipes it includes, and then download the cookbooks required. In a node run list:

recipe[apache2]

Chef will download the apache2 cookbook for the node so it can apply this recipe.

You might have a cookbook specific for a role in your infrastructure. More commonly you'll have cookbooks that are for setting up certain types of services like apache2, mysql, redis, haproxy, etc. Then you would put those into appropriate roles. If you have custom application specific things that need to happen to fulfill a role, then you could write this into a custom cookbook (like I referenced above).
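The run list expansion described above can be illustrated with a toy model. This is only a sketch of the idea, not how Chef Infra implements it: the ROLES data, the method names, and the recipe names other than apache2 are hypothetical, and real expansion also deals with environments and version constraints.

```ruby
# Hypothetical role definitions: role name => run list.
ROLES = {
  'base' => ['recipe[ntp]'],
  'webserver' => ['role[base]', 'recipe[apache2]', 'recipe[ourcompany_customsoftware]']
}.freeze

# Expand a run list into a flat, de-duplicated list of recipe names.
def expand_run_list(run_list, roles, seen_roles = [])
  run_list.flat_map do |item|
    match = item.match(/\A(recipe|role)\[(.+)\]\z/)
    raise ArgumentError, "bad run list item: #{item}" unless match

    kind, name = match.captures
    if kind == 'recipe'
      [name]
    elsif seen_roles.include?(name)
      [] # role already being expanded on this path, avoid looping forever
    else
      expand_run_list(roles.fetch(name), roles, seen_roles + [name])
    end
  end.uniq
end

# The cookbook is the part of the recipe name before "::".
def cookbooks_for(recipes)
  recipes.map { |recipe| recipe.split('::').first }.uniq
end

recipes = expand_run_list(['role[webserver]', 'recipe[apache2::mod_ssl]'], ROLES)
puts recipes.inspect
puts cookbooks_for(recipes).inspect
```

The second list is the set of cookbooks the node would need to download in this toy model.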

Is there anything like puppets external node classifier so nodes automatically determine their roles?

"Yes." The Chef Infra Server does node data storage (in JSON) automatically, and the server also automatically indexes all the node data for search.

It seems like you can configure things with knife or within the management console, or editing JSON files? This is super confusing to me why there are so many ways to do things, it's paralyzing! Is there a reason to use one or the other?

The Chef Infra Server has a RESTful API that sends and receives JSON responses. Knife and the management console are user interfaces for interacting with the API from an administration point of view.

You can use the tool you like better, though the management console doesn't have as many features as Knife. Most people that use Chef Infra prefer the command-line interface for the power and flexibility it provides, even folks who are using Chef Infra on Windows. Further, knife is a plugin-based tool, so you can create new plugins to interact with the Chef Infra Server or with other parts of your infrastructure.

Chef Infra is a set of libraries, primitives, and an API. It gives you the flexibility to build the configuration management system that works best for your infrastructure.

Linux Web Server – Proper Permissions for Website Files and Folders

When deciding what permissions to use, you need to know exactly who your users are and what they need. A webserver interacts with two types of user.

Authenticated users have a user account on the server and can be provided with specific privileges. This usually includes system administrators, developers, and service accounts. They usually make changes to the system using SSH or SFTP.

Anonymous users are the visitors to your website. Although they don't have permissions to access files directly, they can request a web page and the web server acts on their behalf. You can limit the access of anonymous users by being careful about what permissions the web server process has. On Debian-based distributions, Apache runs as the www-data user, but it can be different (on Red Hat-based systems it is usually apache). Use ps aux | grep httpd or ps aux | grep apache to see what user Apache is using on your system.

Notes on linux permissions

Linux and other POSIX-compliant systems use traditional unix permissions. There is an excellent article on Wikipedia about Filesystem permissions so I won't repeat everything here. But there are a few things you should be aware of.

The execute bit
Interpreted scripts (eg. Ruby, PHP) work just fine without the execute permission. Only binaries and shell scripts need the execute bit. In order to traverse (enter) a directory, you need to have execute permission on that directory. The webserver needs this permission to list a directory or serve any files inside of it.

Default new file permissions
When a file is created, it normally inherits the group id of whoever created it. But sometimes you want new files to inherit the group id of the folder where they are created, so you would enable the SGID bit on the parent folder.

Default permission values depend on your umask. The umask subtracts permissions from newly created files and directories, so the common value of 022 results in files being created with 644 and directories with 755. When collaborating with a group, it's useful to change your umask to 002 so that files you create can be modified by group members. And if you want to customize the permissions of uploaded files, you either need to change the umask for Apache or run chmod after the file has been uploaded.
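The umask arithmetic can be sketched in a few lines of Ruby. This assumes the usual case where a program asks for 666 on new files and 777 on new directories; a program is free to request tighter modes. The helper name default_modes is made up for the example.

```ruby
require 'tmpdir'

# Permissions a new file or directory ends up with for a given umask,
# assuming the creating program requests 666 (files) and 777 (directories).
def default_modes(umask)
  { file: 0o666 & ~umask, directory: 0o777 & ~umask }
end

[0o022, 0o002, 0o027].each do |mask|
  modes = default_modes(mask)
  puts format('umask %03o -> files %03o, directories %03o',
              mask, modes[:file], modes[:directory])
end
# umask 022 -> files 644, directories 755
# umask 002 -> files 664, directories 775
# umask 027 -> files 640, directories 750

# Check the arithmetic against the real filesystem in a scratch directory.
Dir.mktmpdir do |dir|
  old_mask = File.umask(0o027)
  begin
    path = File.join(dir, 'example.txt')
    File.write(path, 'hello')
    puts format('created with %03o', File.stat(path).mode & 0o777)
  ensure
    File.umask(old_mask) # restore the previous umask for the rest of the process
  end
end
```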

The problem with 777

When you chmod 777 your website, you have no security whatsoever. Any user on the system can change or delete any file in your website. But more seriously, remember that the web server acts on behalf of visitors to your website, and now the web server is able to change the same files that it's executing. If there are any programming vulnerabilities in your website, they can be exploited to deface your website, insert phishing attacks, or steal information from your server without you ever knowing.

Additionally, if your server runs on a well-known port (which it should to prevent non-root users from spawning listening services that are world-accessible), that means your server must be started by root (although any sane server will immediately drop to a less-privileged account once the port is bound). In other words, if you're running a webserver where the main executable is part of the version control (e.g. a CGI app), leaving its permissions (or, for that matter, the permissions of the containing directory, since the user could rename the executable) at 777 allows any user to run any executable as root.

Define the requirements

Developers need read/write access to files so they can update the website
Developers need read/write/execute on directories so they can browse around
Apache needs read access to files and interpreted scripts
Apache needs read/execute access to serveable directories
Apache needs read/write/execute access to directories for uploaded content

Maintained by a single user

If only one user is responsible for maintaining the site, set them as the user owner on the website directory and give the user full rwx permissions. Apache still needs access so that it can serve the files, so set www-data as the group owner and give the group r-x permissions.

In your case, Eve, whose username might be eve, is the only user who maintains contoso.com:

chown -R eve contoso.com/
chgrp -R www-data contoso.com/
chmod -R 750 contoso.com/
chmod g+s contoso.com/

ls -l
drwxr-s--- 2 eve      www-data   4096 Feb  5 22:52 contoso.com
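
If the way octal modes map onto the ls -l column is not obvious, this small sketch renders a numeric mode as the string ls would show. The helper name mode_string is made up; it only does the bit arithmetic and does not touch the filesystem.

```ruby
# Render a numeric mode the way `ls -l` prints it, including the
# setuid, setgid, and sticky bits.
def mode_string(mode, directory: false)
  chars = directory ? +'d' : +'-'
  # Each entry: shift for the rwx triplet, special bit, letter with/without x.
  [[6, 0o4000, 's', 'S'], [3, 0o2000, 's', 'S'], [0, 0o1000, 't', 'T']].each do |shift, special, with_x, without_x|
    bits = (mode >> shift) & 0o7
    executable = !(bits & 0o1).zero?
    special_set = !(mode & special).zero?
    last = if special_set
             executable ? with_x : without_x
           else
             executable ? 'x' : '-'
           end
    chars << ((bits & 0o4).zero? ? '-' : 'r')
    chars << ((bits & 0o2).zero? ? '-' : 'w')
    chars << last
  end
  chars
end

puts mode_string(0o2750, directory: true) # chmod 750 followed by chmod g+s
puts mode_string(0o2770, directory: true) # the same directory after chmod g+w
```

The first line printed is drwxr-s---, matching the listing above: 750 gives rwxr-x---, and the setgid bit turns the group x into s.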

If you have folders that need to be writable by Apache, you can just modify the permission values for the group owner so that www-data has write access.

chmod g+w uploads

ls -l
drwxrws--- 2 eve      www-data   4096 Feb  5 22:52 uploads

The benefit of this configuration is that it becomes harder (but not impossible*) for other users on the system to snoop around, since only the user and group owners can browse your website directory. This is useful if you have secret data in your configuration files. Be careful about your umask! If you create a new file here, the permission values will probably default to 644 (755 for a new directory). You can run umask 027 so that new files default to 640 (rw- r-- ---).

Maintained by a group of users

If more than one user is responsible for maintaining the site, you will need to create a group to use for assigning permissions. It's good practice to create a separate group for each website, and name the group after that website.

groupadd dev-fabrikam
usermod -a -G dev-fabrikam alice
usermod -a -G dev-fabrikam bob

In the previous example, we used the group owner to give privileges to Apache, but now that is used for the developers group. Since the user owner isn't useful to us any more, setting it to root is a simple way to ensure that no privileges are leaked. Apache still needs access, so we give read access to the rest of the world.

chown -R root fabrikam.com
chgrp -R dev-fabrikam fabrikam.com
chmod -R 775 fabrikam.com
chmod g+s fabrikam.com

ls -l
drwxrwsr-x 2 root     dev-fabrikam   4096 Feb  5 22:52 fabrikam.com

If you have folders that need to be writable by Apache, you can make Apache either the user owner or the group owner. Either way, it will have all the access it needs. Personally, I prefer to make it the user owner so that the developers can still browse and modify the contents of upload folders.

chown -R www-data uploads

ls -l
drwxrwxr-x 2 www-data     dev-fabrikam   4096 Feb  5 22:52 uploads

Although this is a common approach, there is a downside. Since every other user on the system has the same privileges to your website as Apache does, it's easy for other users to browse your site and read files that may contain secret data, such as your configuration files.

You can have your cake and eat it too

This can be further improved upon. It's perfectly legal for the owner to have fewer privileges than the group, so instead of wasting the user owner by assigning it to root, we can make Apache the user owner on the directories and files in your website. This is a reversal of the single maintainer scenario, but it works equally well.

chown -R www-data fabrikam.com
chgrp -R dev-fabrikam fabrikam.com
chmod -R 570 fabrikam.com
chmod g+s fabrikam.com

ls -l
dr-xrws--- 2 www-data  dev-fabrikam   4096 Feb  5 22:52 fabrikam.com

If you have folders that need to be writable by Apache, you can just modify the permission values for the user owner so that www-data has write access.

chmod u+w uploads

ls -l
drwxrwx--- 2 www-data  dev-fabrikam   4096 Feb  5 22:52 uploads

One thing to be careful about with this solution is that the user owner of new files will match the creator instead of being set to www-data. So any new files you create won't be readable by Apache until you chown them.

*Apache privilege separation

I mentioned earlier that it's actually possible for other users to snoop around your website no matter what kind of privileges you're using. By default, all Apache processes run as the same www-data user, so any Apache process can read files from all other websites configured on the same server, and sometimes even make changes. Any user who can get Apache to run a script can gain the same access that Apache itself has.

To combat this problem, there are various approaches to privilege separation in Apache. However, each approach comes with various performance and security drawbacks. In my opinion, any site with higher security requirements should be run on a dedicated server instead of using VirtualHosts on a shared server.

Additional considerations

I didn't mention it before, but it's usually a bad practice to have developers editing the website directly. For larger sites, you're much better off having some kind of release system that updates the webserver from the contents of a version control system. The single maintainer approach is probably ideal, but instead of a person you have automated software.

If your website allows uploads that don't need to be served out, those uploads should be stored somewhere outside the web root. Otherwise, you might find that people are downloading files that were intended to be secret. For example, if you allow students to submit assignments, they should be saved into a directory that isn't served by Apache. This is also a good approach for configuration files that contain secrets.

For a website with more complex requirements, you may want to look into the use of Access Control Lists. These enable much more sophisticated control of privileges.

If your website has complex requirements, you may want to write a script that sets up all of the permissions. Test it thoroughly, then keep it safe. It could be worth its weight in gold if you ever find yourself needing to rebuild your website for some reason.
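
As a starting point for such a script, here is a minimal sketch in Ruby. The helper name apply_site_modes and the default modes are assumptions chosen to match the single maintainer layout (setgid 750 directories, 640 files). Unlike chmod -R 750, it gives directories and regular files different modes, so files do not pick up an execute bit they don't need. Ownership changes are left out because they normally require root; FileUtils.chown_R could be added for that.

```ruby
require 'find'

# Hypothetical helper: walk the site and give directories dir_mode and
# regular files file_mode. Symlinks and special files are left alone.
def apply_site_modes(root, dir_mode: 0o2750, file_mode: 0o640)
  Find.find(root) do |path|
    stat = File.lstat(path)
    if stat.directory?
      File.chmod(dir_mode, path)
    elsif stat.file?
      File.chmod(file_mode, path)
    end
  end
end

# Example usage (the path is a placeholder for your own site directory):
#   apply_site_modes('contoso.com')
```

Keep dir_mode readable and executable for the user running the script, otherwise the walk cannot descend into the directories it has just changed.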
