Centos – systemd-journald fails to start on CentOS 7

centosjournaldsystemd

Since a week ago, journald doesn't finish starting anymore. When booting up, it starts up and logs the boot messages and a few other service startup logs, and then abruptly shuts down when starting Switch Root:

Mar 18 05:22:56 playground1.local systemd-journal[96]: Runtime journal is using 8.0M (max allowed 391.1M, trying to leave 586.7M free of 3.8G available → current limit 391.1M).
Mar 18 05:22:56 playground1.local systemd-journal[96]: Runtime journal is using 8.0M (max allowed 391.1M, trying to leave 586.7M free of 3.8G available → current limit 391.1M).
Mar 18 05:22:56 playground1.local kernel: Initializing cgroup subsys cpuset
Mar 18 05:22:56 playground1.local kernel: Initializing cgroup subsys cpu
Mar 18 05:22:56 playground1.local kernel: Initializing cgroup subsys cpuacct
Mar 18 05:22:56 playground1.local kernel: Linux version 3.10.0-327.10.1.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Tue Feb 16 17:03:50 UTC 2016
Mar 18 05:22:56 playground1.local kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-327.10.1.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap crashkernel=auto rhgb quiet LANG=en_US.UTF-8 console=ttyS0 systemd.debug

...

Mar 18 05:22:57 playground1.local systemd[1]: Reached target Switch Root.
Mar 18 05:22:57 playground1.local systemd[1]: Starting Switch Root.
Mar 18 05:22:57 playground1.local systemd[1]: Starting Switch Root...
Mar 18 05:22:57 playground1.local systemd[1]: Switching root.
Mar 18 05:22:57 playground1.local systemd-journal[96]: Journal stopped

If I try to start it manually, it fails quickly:

[root@playground1 ~]# systemctl start systemd-journald.service
Job for systemd-journald.service failed because the control process exited with error code. See "systemctl status systemd-journald.service" and "journalctl -xe" for details.
[root@playground1 ~]# systemctl status systemd-journald.service
● systemd-journald.service - Journal Service
  Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
  Active: failed (Result: start-limit) since Fri 2016-03-18 16:38:21 EDT; 21s ago
    Docs: man:systemd-journald.service(8)
          man:journald.conf(5)
  Process: 3404 ExecStart=/usr/lib/systemd/systemd-journald (code=exited, status=1/FAILURE)
Main PID: 3404 (code=exited, status=1/FAILURE)
  Status: "Shutting down..."
[root@playground1 ~]#

journalctl -xe prints the boot logs, when manually starting the service it doesn't get to actually write anything to the journal. If I delete the journal file, it doesn't get created.

Running /usr/lib/systemd/systemd-journald manually kind of works, meaning that the process is started, and journalctl contains the journald startup logs, but nothing else is actually logged afterwards.

systemd-journald.socket also seems to fail to load, and it might be the cause of the failure, but I don't see anything in the logs about it:

[root@playground1 systemd]# systemctl status systemd-journald.socket
● systemd-journald.socket - Journal Socket
  Loaded: loaded (/usr/lib/systemd/system/systemd-journald.socket; static; vendor preset: disabled)
  Active: failed (Result: service-failed-permanent) since Fri 2016-03-18 16:46:24 EDT; 4min 8s ago
    Docs: man:systemd-journald.service(8)
          man:journald.conf(5)
  Listen: /run/systemd/journal/stdout (Stream)
          /run/systemd/journal/socket (Datagram)
          /dev/log (Datagram)

I don't think it's related to SELinux, since the only thing that's denied is reading machine-id, which shouldn't be a showstopper. However, one weird thing that I did notice in the audit logs is that the service appears to be starting twice?

type=AVC msg=audit(1458334641.993:330): avc:  denied  { read } for  pid=3689 comm="systemd-journal" name="machine-id" dev="dm-0" ino=67109045 scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=SYSCALL msg=audit(1458334641.993:330): arch=c000003e syscall=2 success=no exit=-13 a0=7fbc38ebf08f a1=80100 a2=7ffeb87c80bf a3=3 items=0 ppid=1 pid=3689 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="systemd-journal" exe="/usr/lib/systemd/systemd-journald" subj=system_u:system_r:syslogd_t:s0 key=(null)
type=SERVICE_START msg=audit(1458334642.002:331): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
type=SERVICE_START msg=audit(1458334642.002:332): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
type=SERVICE_STOP msg=audit(1458334642.002:333): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

See the two SERVICE_START messages, one with res=failed and one with res=success.

Google tells me that nobody else had this problem. Any ideas?

Best Answer

The AVC indicates that the file being read doesn't have an SELinux label.

This can happen if SELinux was disabled when the file was created, or if it is created on a filesystem that isn't SELinux-capable (e.g. vfat).

If someone had disabled and re-enabed SELinux, then you should be able to fix the problem (and possibly many others) by relabeling the filesystem. You can do this online with:

restorecon -r /

If the filesystem on which the file resides doesn't support SELinux, then you need to replace it with a filesystem that does support SELinux, and then relabel.

Checking labeling

It is unlikely a label restore prevents you from starting a service if SELinux is in permissive mode. Also, why the -F switch?

To know if you have to restore the labeling of a directory or file, first find out what context a file or directory is supposed to have:

# matchpathcon /etc/openldap/
/etc/openldap   system_u:object_r:etc_t:s0

Then list its security context:

# ls -ldZ /etc/openldap/
drwxr-xr-x. root root system_u:object_r:etc_t:s0       /etc/openldap//

In this example, no further action is needed.

With regards to your issue, the problem is not labeling per se, but a missing type enforcement rule, i.e., a rule that allows a labeled process to transition from one confined domain to another, or to read files with an specific label, for example.

Creating an SELinux module

You can try to build a module that allows the slapd_t to perform the operations which appeared in the audit.log. It is probable that you need further adjustments in your code. Use audit2allow, and make for this task. All commands are very well documented in their respective manpages. The process will look roughly like this (after copying the relevant messages into audit.txt):

audit2allow -i audit.txt -m slapd -o slapd.te
make -f /usr/share/selinux/devel/Makefile load

Also, check if a bug report for the SELinux policy regarding this issue already exists.

Best Answer

Related Solutions

Centos – rsyslog-thesql on CentOS 5.3 does not have permission to access the thesql.sock

Centos – Configure selinux to allow openldap on CentOS 6.4

Checking labeling

Creating an SELinux module

Related Topic