Linux – Sysadmin performance metrics

linux performance

I work at a dot-com, and part of our team's responsibility is to maintain the production web application and server farm. Our department was created only recently, and we now have a huge amount of catching up to do: patching servers and implementing monitoring and backups.

To start on this monster we've broken it down into phases. As part of the first phase, we are reinstalling the OS on several servers to get them off old Red Hat 8 (not Fedora 8) installs. Because this is a web app, the servers need to run Apache and PHP. The modules that need to be compiled into these programs are documented, and an old build process for compiling them is documented as well.

As sysadmins, what do you expect to have documented, and what should you be documenting yourselves? Since both the build process and the documentation need to be updated, what is the best way to lay out the items that need to be done? Should defining the steps be part of the sysadmin's job or the technical manager's job? Is this part of what distinguishes a "senior Unix engineer" from a junior engineer? What standard would you want to be held to on a project like this if it affected your performance review?

Edit:
The application is under continuous development. Most of it was written in PHP4 and continues to run on PHP4; however, newer code that runs as a web service uses PHP5. So the same boxes have both a PHP4 and a PHP5 installation. The modules required for each build are documented, and the sysadmin has that doc.

Best Answer

If it's a unique problem, how can you tell whether the difficulty lies with the person or with the problem itself?

You should be documenting everything that would be required to get your department running again if half your people were killed, fired, etc. If you needed to rebuild the department with new admins, they should be able to get things running again at a new location using your documentation.
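One way to keep that kind of documentation honest is to make part of it checkable. Here is a minimal sketch, assuming the documented module list for each PHP build is kept as a plain list and compared against text in the format printed by `php -m`. The function names and the module names in the sample are hypothetical, chosen only for illustration.

```python
def parse_php_modules(php_m_output):
    """Return the set of module names found in `php -m`-style output."""
    modules = set()
    for line in php_m_output.splitlines():
        name = line.strip()
        # Skip blank lines and section headers such as "[PHP Modules]".
        if not name or name.startswith("["):
            continue
        modules.add(name.lower())
    return modules


def missing_modules(documented, php_m_output):
    """List documented modules that the installation does not report."""
    installed = parse_php_modules(php_m_output)
    return sorted(m for m in documented if m.lower() not in installed)


if __name__ == "__main__":
    # Hypothetical sample; in practice this would be captured from each
    # PHP binary on the box (the PHP4 one and the PHP5 one separately).
    sample = "[PHP Modules]\ncurl\nmysql\nxml\n\n[Zend Modules]\n"
    documented = ["curl", "gd", "mysql"]  # hypothetical documented list
    print(missing_modules(documented, sample))
```

Run against each installation after a rebuild, a check like this shows whether the doc and the box still agree, which is roughly what a new admin would need to know.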

In practice... hee! Yeah, right. In most places you're lucky if the docs exist at all, let alone are kept up to date.

If you're managing the monster tasks, perhaps you just need to meet with your admins and ask how things are going and what's been tried. If he's been tasked with just this problem for three weeks and it's not getting solved, is it because he's not working on it? What has he tried to rectify the issue?

You can't micromanage the issue or he'll probably start fighting you on it. A sysadmin needs enough freedom to work without feeling like he's being scrutinized at every step. But if the project or task is really far behind, then you have a legitimate concern. Find out from him whether there's something he needs in order to get the job done, or what problem he is having difficulty overcoming.

Good book: Managing Humans by Michael Lopp.

Performance should be based on how well IT issues are addressed to meet the needs of users, along with how well the servers and infrastructure are maintained. You can't possibly reduce it to "solving X issues a day" or "writing X lines of code" to measure each employee.

Maybe you can get input from others on the team about how their colleagues are doing and what the major needs are. Good techies want to work with good techies. They don't want to work with people who are "happy and nice" but incompetent. They'll work with a grumpy curmudgeon who hates being in the room with them if it means that everything works well and the curmudgeon knows his stuff.
