I am assuming you're using "bridge-mode" for networking (your internal, virtual adapter is bridged to your host's physical adapter).
In any case (unless you explicitly set them to the same address, which causes a lot of other problems), your guest (WinXP) machine will have a different MAC address than your host (CentOS). In bridge mode, your host acts as an Ethernet switch and forwards packets to the guest.
So when an ARP broadcast comes, your host gets the packet and also forwards it to your guest machine. When packets for your host come, they are processed by your host's network stack. When packets for your guest come, the host forwards them to your guest, which then processes them as needed.
Switches operate at layer-2 (e.g. Ethernet), and they don't get involved with layer-3 (e.g. IPv4, IPX, IPv6, AppleTalk). This allows switches to switch traffic at layer-2 for any layer-3 protocol.
ARP is used by hosts to translate a layer-3 address to a layer-2 (MAC) address, so switches don't use ARP, nor are they even aware of the layer-3 addressing.
If a host doesn't have the layer-2 address for a particular layer-3 address in its ARP table, it will use ARP (broadcast) to discover the layer-2 address for that layer-3 address.
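As a rough illustration of that host-side behavior, here is a minimal sketch of an ARP cache. The class and method names (`ArpCache`, `resolve`, `handle_reply`) are made up for this example, and the `sent` list simply records the requests that would go on the wire; no real networking happens.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class ArpCache:
    """Toy host ARP table: maps a layer-3 (IP) address to a layer-2 (MAC) address."""

    def __init__(self):
        self.table = {}  # IP address -> MAC address
        self.sent = []   # (destination MAC, target IP) of each ARP request "sent"

    def resolve(self, ip):
        """Return the cached MAC, or queue a broadcast ARP request on a miss."""
        mac = self.table.get(ip)
        if mac is None:
            # Cache miss: the request is addressed to the layer-2 broadcast address.
            self.sent.append((BROADCAST, ip))
        return mac

    def handle_reply(self, ip, mac):
        """Record the mapping carried in an ARP reply."""
        self.table[ip] = mac
```

The first `resolve("192.168.1.20")` call misses and records a broadcast request; after `handle_reply` stores the answer, the same call returns the MAC without sending anything.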
A switch has a MAC address table, not an ARP table like the hosts (except where it is a host for management purposes, but that has nothing to do with the switching function). While an ARP table can look up a layer-2 address from a layer-3 address, a switch MAC address table will look up a switch interface from a layer-2 address. Many people confuse the two tables.
If a switch doesn't have a layer-2 address in its MAC address table, then it will flood the frame to all interfaces, except the one where it entered the switch.
Switches will broadcast any frame with the broadcast layer-2 address to all switch interfaces, except the one where the frame entered the switch.
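The learning, flooding, and broadcast behavior described above can be sketched as a toy model. `LearningSwitch` and its `receive` method are names invented for this example, and aging of table entries, VLANs, and the like are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy layer-2 switch: it knows MAC addresses and ports, nothing about layer-3."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port where that MAC was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the sorted list of ports the frame is sent out of."""
        # Learn (or refresh) the port for the source MAC address.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown unicast: every port except the ingress port.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            # Destination is on the segment the frame came from: nothing to forward.
            return []
        return [out_port]
```

Notice that no IP address appears anywhere in the model: an ARP request is just one more frame with the broadcast destination address.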
Multicast, at layer-2, where switches operate, is a form of broadcast, and multicast frames are treated like broadcasts. This has been mitigated by IGMP snooping in many newer switches, which allows a switch to snoop on the IGMP membership reports (join requests) that hosts send to a multicast router. A switch with IGMP snooping enabled will learn and build a table of which interfaces have requested to join which multicast groups, and it will only send traffic for a multicast group to the interfaces that joined it.
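A simplified sketch of the snooping table might look like the following. The names (`SnoopingSwitch`, `snoop_join`, `snoop_leave`, `forward_multicast`) are hypothetical, and the model makes two assumptions: traffic for a group with no known members is flooded, and multicast router ports and membership timeouts are ignored.

```python
from collections import defaultdict


class SnoopingSwitch:
    """Toy model of the group table a switch builds from snooped IGMP messages."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.groups = defaultdict(set)  # multicast group address -> member ports

    def snoop_join(self, port, group):
        """A host on `port` was seen asking to join `group`."""
        self.groups[group].add(port)

    def snoop_leave(self, port, group):
        """A host on `port` was seen leaving `group`."""
        self.groups[group].discard(port)

    def forward_multicast(self, in_port, group):
        """Return the sorted list of ports that receive a frame for `group`."""
        members = self.groups.get(group)
        if not members:
            # Assumption: with no known members, treat the frame like a broadcast.
            return sorted(self.ports - {in_port})
        return sorted(members - {in_port})
```

Before any join is seen, the frame goes everywhere except the ingress port; once a port joins, only that port receives the group's traffic.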
None of the switch behavior has anything to do with ARP or ARP tables in hosts.
Best Answer
When in doubt, the RFC which creates the protocol usually has the answer. RFC 826, An Ethernet Address Resolution Protocol, has a description of what happens when a host receives an ARP request, including the rationale for why it is done the way it is. It says that you, "Swap hardware and protocol fields, putting the local hardware and protocol addresses in the sender fields." Some may interpret this to mean that you only change the sender fields, while others may think that swap means swap. This is how you get conflicting information about, and implementations of, these protocols. Often a succeeding RFC will come out to explain which interpretation is the correct one. In this case, ARP is really a layer-2 communication, so the destination IP address isn't really the issue in a reply, because the reply is sent back to the requesting host's MAC address, and it is not passed back up the stack to layer-3.
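To make the quoted step concrete, here is a minimal sketch of one reading of it: the requester's addresses move into the target fields and the local addresses go into the sender fields. The `ArpPacket` dataclass and `build_reply` function are illustrative names, not taken from the RFC or from any real stack, and only the four address fields and the opcode are modeled.

```python
from dataclasses import dataclass

REQUEST, REPLY = 1, 2  # ARP opcodes


@dataclass(frozen=True)
class ArpPacket:
    op: int
    sha: str  # sender hardware (MAC) address
    spa: str  # sender protocol (IP) address
    tha: str  # target hardware (MAC) address
    tpa: str  # target protocol (IP) address


def build_reply(request, my_mac, my_ip):
    """Return the reply to `request`, or None if it is not a request for `my_ip`."""
    if request.op != REQUEST or request.tpa != my_ip:
        return None
    # The old sender becomes the target; the local addresses fill the sender fields.
    return ArpPacket(op=REPLY, sha=my_mac, spa=my_ip,
                     tha=request.sha, tpa=request.spa)
```

In this sketch the reply's target hardware address is the requester's MAC address, which is where the frame is delivered at layer-2.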
By the way, it doesn't matter if both devices are in the same subnet since a PC should never ARP for an IP address not in its subnet; it will send an ARP for the configured gateway IP address, instead of a destination IP address not in its subnet.
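The decision a host makes here can be sketched with Python's standard `ipaddress` module. The function name `arp_target` and the example addresses are made up for illustration, and the sketch assumes a single interface with one default gateway.

```python
import ipaddress


def arp_target(local_interface, gateway, destination):
    """Return the IP address a host would ARP for when sending to `destination`.

    `local_interface` is the host's address with prefix length, e.g. "192.168.1.10/24".
    """
    iface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        # On-link destination: resolve the destination itself.
        return str(dest)
    # Off-subnet destination: resolve the configured gateway instead.
    return str(ipaddress.ip_address(gateway))
```

For example, a host at 192.168.1.10/24 with gateway 192.168.1.1 would ARP for 192.168.1.50 directly, but for 192.168.1.1 when sending to 8.8.8.8.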