I would strongly suggest that management reconsider trying to track things in this level of detail. It's going to be inherently subject to gaming.
I've seen clients attempt to do something similar but at a group level rather than at an individual level. What inevitably happened was that there was a strong incentive for each manager to get their group's bugs classified as low priority or classified as an enhancement and people started to get very defensive whenever there was a suggestion that there was a bug in their code. From a metric standpoint, it looked like code quality was up tremendously month over month but that was only because anything that didn't cause a total systems failure was being tagged as an enhancement or a low priority bug.
What is management trying to achieve by tracking developer effectiveness? If they want to improve overall code quality, it probably makes sense to have a feedback loop from the bug tracking system that tries to determine why a bug made it to post-production and what should be done in the future to prevent similar bugs. It may be that the requirements were unclear or inconsistent. It may be that the developer was sloppy. It may be that the QA department needs to more thoroughly test certain data conditions. It may be that management made a calculated decision to rush some functionality to hit an external deadline.
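As a sketch of what such a feedback loop might surface (the root-cause tags and records below are hypothetical; in practice the tags would be assigned by people during a blameless review of each post-production bug, not derived automatically):

```python
from collections import Counter

# Hypothetical post-production bug records. The "root_cause" tag is assumed
# to be assigned during the review of each escaped bug.
bugs = [
    {"id": 101, "root_cause": "unclear-requirements"},
    {"id": 102, "root_cause": "missed-test-condition"},
    {"id": 103, "root_cause": "unclear-requirements"},
    {"id": 104, "root_cause": "deadline-rush"},
]

def root_cause_summary(bug_records):
    """Tally root-cause tags so recurring process problems stand out."""
    return Counter(b["root_cause"] for b in bug_records)

# Most common causes first; a dominant tag points at a process fix,
# not at an individual developer.
print(root_cause_summary(bugs).most_common())
```

The point of the tally is to direct attention at the process: if "unclear-requirements" dominates release after release, no amount of developer-level scoring will fix it.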
But if the intention is to improve code quality, this feedback loop has to be reasonably safe. That is, people have to have reason to trust that admitting to reasonable mistakes isn't going to cause problems for them down the line at review time. For example, if the QA department missed a bug because they're supposed to do dozens of poorly documented manual steps to test something and someone innocently missed a step, they have to feel safe in admitting the mistake so that management can identify the fact that they need to allocate time for someone to automate more of the QA process. If the problem is that the project manager made a last-minute change to the requirements which caused the developer to rush a change in and for QA to skimp on the testing, everyone needs to feel safe enough to discuss how they might have handled that situation differently in the future. If the folks that are most willing to admit to making mistakes in order to improve the process are the ones that are getting lower ratings during reviews because everyone else is pointing fingers and denying responsibility, you're not going to have a positive effect on code quality.
If you are going to report some sort of numeric KPI, the most meaningful numbers will have to come from something that the development staff cannot reasonably game, and they will have to be at a very coarse level of granularity. The set of numeric indicators that the development team cannot game tends to be very application- and organization-dependent. For example, you may be able to drive some metrics by parsing the application logs to look for certain types of errors (e.g. how many times a user landed on an error page because of an internal error). You may be able to drive metrics based on things like how quickly the software allowed a user to accomplish a particular task.
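A log-derived metric like "error-page hits per day" might be computed roughly like this (the log line format, the regex, and the use of HTTP 500 as the internal-error marker are all assumptions; adapt them to whatever your application actually writes):

```python
import re
from collections import Counter

# Assumed log line shape: "YYYY-MM-DD HH:MM:SS "GET /path HTTP/1.1" status"
LINE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) .* "GET (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def error_page_hits_per_day(lines):
    """Count requests that ended in an internal error (status 500), per day."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if m and m.group("status") == "500":
            counts[m.group("date")] += 1
    return counts

sample = [
    '2024-03-01 10:02:11 "GET /customers HTTP/1.1" 200',
    '2024-03-01 10:05:40 "GET /error HTTP/1.1" 500',
    '2024-03-02 09:14:02 "GET /error HTTP/1.1" 500',
]
print(error_page_hits_per_day(sample))
```

A developer can't easily game this number without actually reducing the errors users see, which is exactly the property you want in a reported KPI.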
The set of things that the development team cannot game, however, is likely to result in metrics that apply to large swaths of the development organization. Performance-based metrics (e.g. our logistics software has improved inventory turn times 10% this year) require that the entire team works together, from the developers to the DBAs to the hardware group. So they're not going to be meaningful for tracking how productive an individual or even a group is. But they are the sorts of metrics that you actually want senior management to manage to. Senior management shouldn't care whether Jimmy the Developer is writing buggy code (though Jimmy's immediate manager should be aware). But they should be aware if Jimmy's buggy code is causing the call center's customer lookup operation to waste 10 hours of call center rep time every day, or if some cool-looking new feature is chewing up 50% of the available CPU and slowing the rest of the system down.
Lower-level managers can participate in the QA feedback loop and will interact regularly with the various development teams. It should be clear to them which individual developers are particularly strong and which are particularly weak. It should be clear where the recurring pain points are, whether those pain points are communication, politics, or developer strength. Having numeric KPIs at these low levels is going to be exceptionally difficult: they are too easy to game and they create perverse incentives. A developer's manager should understand whether a developer who is being assigned a lot of bugs is a weak developer who needs mentoring, a strong developer who is being exceptionally productive, or an unlucky developer who is responsible for a legacy module known to be exceptionally complex or buggy.
"...finds a bug, the report goes into a bug tracking database and also becomes a story which should be prioritized just like all other work.
The question is, should bug tracking and feature tracking be different, and can you use a single system to do both as well as schedule iterations/milestones/etc...
In terms of a "pure" Agile approach, you allow your team to use any combination of tools and processes that works well for them. Sure, you may find a single product that does everything, but perhaps it doesn't do some things as well as you'd like. If you run multiple systems, you need to determine just how integrated they need to be, and if any integration is needed, find the means to do it, and decide just how much information needs to be duplicated. It all boils down to a cost/benefit situation, so naturally any system employed needs to take into account the impact on a team's overall efficiency.
Where I work, we use Redmine to track bugs and features in a single system for multiple projects, with links between projects where dependencies exist. We create labels that correspond to milestones, which for us are effectively long iterations ranging anywhere from a few weeks to a few months. For individual tasks and features, we tend not to track iterations too closely, so we have no need for burn-down charts, white boards, sticky notes, feature cards and all of that stuff; we've found that for our specific needs, some of it is overkill. Each feature effectively represents a small iteration of between 2 and 10 days' duration, and for those who might care, we log our estimated time versus actual time for later analysis. This may sound a little ad hoc, but it works for us, and ultimately our real measure is working code delivered within a series of time frames.
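The estimate-versus-actual analysis can be sketched as follows (the record layout here is hypothetical; with Redmine itself you would pull roughly the same data from an issue's estimated hours and its logged time entries):

```python
# Minimal sketch: rank features by how badly the actual time overran the
# estimate. Field names are assumptions, not Redmine's actual schema.
def estimate_accuracy(features):
    """Return (feature, actual/estimate) pairs, worst overruns first."""
    ratios = [
        (f["name"], f["actual_hours"] / f["estimated_hours"])
        for f in features
        if f["estimated_hours"]  # skip features with no estimate
    ]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)

features = [
    {"name": "CSV export", "estimated_hours": 16, "actual_hours": 40},
    {"name": "Login audit", "estimated_hours": 24, "actual_hours": 20},
]
for name, ratio in estimate_accuracy(features):
    print(f"{name}: {ratio:.2f}x estimate")
```

Reviewing the worst ratios periodically tells you which kinds of work your team systematically underestimates, which feeds directly back into better milestone planning.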
I suppose if we decided to employ another, more formally "regimented" methodology, we might consider a tool to aid in tracking progress. But with what we currently have invested in our present method, we'd probably feed, at a minimum, the short feature descriptions and time data to another system, unless someone has developed a Redmine module that does what we want. If it became really important to us, we might create the Redmine module ourselves to avoid any nasty integration headaches.
Given this statement:
you need to be looking at systems with reporting tools that effectively allow spreadsheets to be created in "real time" (or as close to it as possible). When you find one, explain that having the developers use a "proper" system will mean that the data they're interested in will (hopefully) be more accurate and up to date.