Judging by your other question, you're a Xilinx guy. So I highly suggest getting the data sheet for your Xilinx chip and going to the Functional Description chapter. For the Spartan 3 chip that I use, it's 42 pages of fun reading. It details exactly what components are inside an FPGA - the IOBs, CLBs, slices, LUTs, Block RAM, Multipliers, Digital Clock Manager, Clock Network, Interconnect, and some very basic configuration information. You need to understand this information if you want to know what a "compiled HDL" looks like.
Once you're familiar with your FPGA's architecture, you can understand the build process. First, your HDL design is run through the synthesis engine, which turns your HDL into a netlist of generic logic elements (gates, flip-flops, memories and so on). Then the Mapper processes the results from Synthesis, "mapping" them onto the available pieces of FPGA architecture. Then Place And Route (PAR) figures out where those pieces go and how to connect them. Finally, the results from PAR are turned into a BIT file. Typically this BIT file is then transformed in some way so that it can be loaded into a Flash chip, so that the FPGA can be programmed automatically when it powers up.
This bit file describes the entire FPGA program. For instance, the CLBs in a Spartan 3 are composed of slices, which are composed of LUTs, which are just 16-address 1-bit SRAMs. So one thing the BIT file will contain is exactly what data goes into each address of the SRAM. Another thing the BIT file contains is how each input of the LUT is wired to the connection matrix. The BIT file will also contain the initial values that go inside the block RAM. It will describe what is connected to the set and reset pins of each flip flop in each slice. It will describe how the carry chain is connected. It will describe the logic interface for each IOB (LVTTL, LVCMOS, LVDS, etc). It will describe any integrated pull-up or pull-down resistors. Basically, everything.
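To make the "LUT is just a 16-address 1-bit SRAM" point concrete, here is a minimal Python sketch of how a 4-input logic function turns into 16 stored bits. The function names (lut4_init, lut4_read, init_hex) are made up for illustration, and the choice of input a as the least significant address bit is an assumption of this sketch; it is not meant to reproduce the actual BIT file layout.

```python
def lut4_init(func):
    """Return the 16 SRAM bits of a 4-input LUT implementing func(a, b, c, d).

    Entry i holds the output for the input combination encoded by i,
    with a as the least significant address bit (an assumption here).
    """
    bits = []
    for addr in range(16):
        a = (addr >> 0) & 1
        b = (addr >> 1) & 1
        c = (addr >> 2) & 1
        d = (addr >> 3) & 1
        bits.append(func(a, b, c, d) & 1)
    return bits


def lut4_read(bits, a, b, c, d):
    """Evaluate the LUT: the four inputs simply form the SRAM address."""
    return bits[a | (b << 1) | (c << 2) | (d << 3)]


def init_hex(bits):
    """Pack the 16 bits into a 4-digit hex string (entry 15 is the top bit)."""
    value = sum(bit << i for i, bit in enumerate(bits))
    return f"{value:04X}"


# Two example functions: a 4-input AND and a 4-input XOR (parity).
and4 = lut4_init(lambda a, b, c, d: a & b & c & d)
xor4 = lut4_init(lambda a, b, c, d: a ^ b ^ c ^ d)

print(init_hex(and4))  # 8000: only address 15 (all inputs high) stores a 1
print(init_hex(xor4))  # 6996
```

Any 4-input function fits in the same 16 bits, which is why the tools only need to write different contents into the LUT to change the logic.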
For Xilinx, the FPGA's memory is cleared when configuration is initiated (i.e. PROG_B is asserted). Once memory is clear, INIT_B goes high to indicate that phase is complete. The BIT file is then loaded, either through JTAG or the Flash chip interface. Once the program is loaded, the Global Set/Reset (GSR) is pulsed, resetting all flip flops to their initial state. The DONE pin then goes high, to indicate configuration is complete. Exactly one clock cycle later, the Global Three-State signal (GTS) is released, allowing outputs to be driven. Exactly one clock cycle later, the Global Write Enable (GWE) is released, allowing the flip flops to begin changing state in response to their inputs. Note that even this final configuration process can be slightly reordered depending on flags that are set in the BIT file.
EDIT:
I should also add that the FPGA program is not permanent because the logic fabric is configured by volatile memory (e.g. SRAM). So when the FPGA loses power, the program is forgotten. That's why it needs non-volatile storage, such as a Flash chip, to hold the FPGA program so that it can be loaded whenever the device is powered on.
FPGAs can replace digital logic as well as (in mixed-signal FPGAs) a few analog components (mostly larger bits like ADCs/DACs, comparators, PLLs, etc.).
The analog/passive equivalent is the FPAA, and there are a few examples already in existence such as Lattice Semiconductor's ispPAC line.
Best Answer
Do not use the Clocking Wizard to generate timing pulses for external use, such as SCL serial clocks for I2C, SPI clocks, or anything similar. The Clocking Wizard's application area is stated in the Vivado Design Suite document "Clocking Wizard v6.0 LogiCORE IP Product Guide", page 7:
The creation of clock networks for use within your FPGA chip, and not of clock pulses/signals that may be used elsewhere.
Also read about clock networks, clock routing and other clocking features: what they do in your FPGA, and why using them to carry your "1Hz virtual clock" signal is an outright waste of resources.
If you want to build "a 1Hz virtual clock" as an exercise in building resource-optimized variable length shift registers, follow the instructions in the Vivado Design Suite document "RAM-Based Shift Register v12.0 LogiCORE IP Product Guide" or similar documents of your choice. The output of your "virtual clock" generating circuit goes to an output IO pin, and not to a clock region; you need not route this signal around within your FPGA.
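As a rough illustration of the shift-register approach, here is a behavioural Python sketch of a cascade of one-hot ring shift registers, each enabled by the pulse from the previous one. It is only a model of the idea, not the LogiCORE IP itself; the names RingDivider and run_cascade, the stage lengths, and the 100 MHz input clock are assumptions made for this example.

```python
import math


class RingDivider:
    """One-hot ring shift register of a given length.

    The single 1 bit advances one position per enabled clock, and the
    stage emits a pulse when that bit wraps around, i.e. once every
    `length` enabled clocks.
    """

    def __init__(self, length):
        self.bits = [1] + [0] * (length - 1)

    def clock(self, enable):
        if not enable:
            return 0
        pulse = self.bits[-1]  # the bit that is about to wrap around
        self.bits = [self.bits[-1]] + self.bits[:-1]
        return pulse


def run_cascade(lengths, cycles):
    """Clock a chain of dividers and return the cycles with an output pulse."""
    stages = [RingDivider(n) for n in lengths]
    pulse_times = []
    for t in range(cycles):
        enable = 1
        for stage in stages:
            enable = stage.clock(enable)  # each stage enables the next
        if enable:
            pulse_times.append(t)
    return pulse_times


# Small demo: a divide-by-3 stage feeding a divide-by-4 stage divides by 12.
print(run_cascade([3, 4], 36))  # [11, 23, 35]

# Assuming a 100 MHz clock, eight divide-by-10 stages would reach 1 Hz.
print(math.prod([10] * 8))  # 100000000
```

The overall division ratio is the product of the stage lengths, so a long period can be built from a handful of short shift registers instead of one very long one.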
If you are interested in how the design tools implement your VHDL/Verilog code, read about slice registers/slice LUTs/distributed RAM.
Do not worry about "burning up millions of cycles just to count": an FPGA is made to run millions of cycles and more, and IP cores take care of optimizing power and resource costs to help the FPGA do a useful job. Just note that for tasks like your hypothetical one, dedicated hardware such as an RTC chip with a 32.768 kHz crystal is the recommended solution.
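To put numbers on the counting, here is a small Python sketch of a plain wrap-around counter used as a divider. The helper names (counter_bits, tick_cycles) and the 100 MHz figure are assumptions for illustration; the 32.768 kHz case shows why that crystal frequency is popular, since 32768 is exactly 2^15.

```python
def counter_bits(clock_hz, tick_hz=1):
    """Flip-flops needed for a binary counter dividing clock_hz down to tick_hz."""
    terminal_count = clock_hz // tick_hz
    return (terminal_count - 1).bit_length()


def tick_cycles(clock_hz, tick_hz, cycles):
    """Simulate the counter and return the clock cycles on which it wraps."""
    terminal_count = clock_hz // tick_hz
    count, ticks = 0, []
    for t in range(cycles):
        if count == terminal_count - 1:
            count = 0          # wrap around and emit a tick
            ticks.append(t)
        else:
            count += 1
    return ticks


print(counter_bits(32_768))       # 15: a 15-bit counter overflows once per second
print(counter_bits(100_000_000))  # 27: assuming a 100 MHz clock
print(tick_cycles(32_768, 1, 70_000))  # [32767, 65535]
```

Either way the cost is a couple of dozen flip-flops, which is negligible next to what even a small FPGA provides.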