I'd probably use Ansible. It's a simple configuration management / orchestration engine that's much easier to get started with than Puppet (Puppet used to be my go-to choice for this, but since discovering Ansible it no longer always is).
The benefit of Ansible here is that it communicates directly over SSH, so you'd be able to get started using just your existing SSH credentials and workflow.
If you're currently configuring your BMCs with ipmitool, you'd be able to do something like:
Define a Hosts file -- This tells Ansible which hosts are in the bmc group (in this case), and which to run stuff on.
    [bmc]
    192.168.1.100
    192.168.1.101
    192.168.1.102
And so on... You can also use hostnames in that file, as long as they're resolvable.
Then create a "playbook", which is the set of commands to run on each host in a host-group.
You want to have this kind of top-down directory layout:
    ansible/
        playbooks/
            bmc.yml
            roles/
                bmcconfig/
                    files/
                    handlers/
                        main.yml
                    tasks/
                        main.yml
                    templates/
            group_vars/
                all
A playbook has Roles, which are little sections of configuration that you can break down and reuse.
So I'd create a file called bmc.yml (all Ansible configuration is in YAML files):

    ---
    - name: Configure BMC on the hosts
      hosts: bmc
      remote_user: root
      roles:
        - bmcconfig
Then, inside roles/bmcconfig/tasks/main.yml, you can start listing the commands to run on each host to communicate with IPMI:

    ---
    - name: Install ipmitool
      apt: pkg=ipmitool state=present

    - name: Run ipmitool config
      shell: ipmitool -your -options -go -here
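As a slightly more concrete sketch of what such tasks might look like, the fragment below reads the BMC's LAN settings and prints them. It assumes the BMC's LAN interface is IPMI channel 1 (this varies by hardware) and that ipmitool runs locally on each host as root; the task names and the bmc_lan variable are made up for illustration.

```yaml
# Hypothetical additions to roles/bmcconfig/tasks/main.yml
- name: Read the current LAN settings of the BMC (channel 1 is an assumption)
  command: ipmitool lan print 1
  register: bmc_lan
  changed_when: false   # read-only, so never report a change

- name: Show what the BMC reported
  debug:
    var: bmc_lan.stdout_lines
```

Using command rather than shell here avoids shell interpretation of the arguments, which is fine as long as you don't need pipes or redirects.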
When you run the playbook with ansible-playbook -i hosts bmc.yml, the commands listed in tasks/main.yml for each role will be executed in top-down order on each host found in the bmc host group in hosts.
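group_vars/all is an interesting file: it lets you define key-value pairs of variables that can be used in your playbooks. So you could define something like

    ipmitool_password: "$512315Adb"

in your group_vars/all and, as a result, you'd be able to have something like:

    shell: ipmitool -your -options -go -here -P {{ ipmitool_password | quote }}

in the playbook. Current Ansible versions use the {{ variable }} syntax rather than the older ${variable} form, ipmitool takes the password with -P, and the quote filter stops the shell from expanding the $ in the password.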
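For a password in particular, something along these lines may be safer. It's only a sketch: the user ID 2 is an assumption (check your IDs with ipmitool user list), and it relies on the ipmitool_password variable from group_vars/all.

```yaml
# Hypothetical task for roles/bmcconfig/tasks/main.yml
- name: Set the password of BMC user ID 2 (the ID is an assumption)
  command: ipmitool user set password 2 {{ ipmitool_password }}
  no_log: true   # keep the password out of Ansible's output and logs
```

If group_vars/all ends up holding real secrets, you could also encrypt it with ansible-vault instead of keeping it in plain text.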
You can find out way more information about how to use the "modules" - the components of Ansible that allow you to do stuff, how to write your own :D, and so on at the Ansible Documentation Pages.
Best Answer
Obviously, if the OS is frozen you can't rely on anything but a mechanism hardwired into the motherboard.
Some uninterruptible power supplies can be remotely controlled.
You can also use a watchdog PCI card.
You can find tons of devices like these: http://dataprobe.com/remote-reboot.html