Typically ASIC design is a team endeavor due to the complexity and quantity of work. I'll give a rough order of steps, though some steps can be completed in parallel or out of order. I will list tools that I have used for each task, but it will not be encyclopedic.
Build a cell library. (Alternatively, most processes have gate libraries that are commercially available. I would recommend this unless you know you need something that is not available.) This involves designing multiple drive strength gates for as many logic functions as needed, designing pad drivers/receivers, and any macros such as an array multiplier or memory. Once the schematic for each cell is designed and verified, the physical layout must be designed. I have used Cadence Virtuoso for this process, along with analog circuit simulators such as Spectre and HSPICE.
Characterize the cell library. (If you have a third party gate library, this is usually done for you.) Each cell in your library must be simulated to generate timing tables for Static Timing Analysis (STA). This involves taking the finished cell, extracting the layout parasitics using Assura, Diva, or Calibre, and simulating the circuit under varying input conditions and output loads. This builds a timing model for each gate that is compatible with your STA package. The timing models are usually in the Liberty file format. I have used SiliconSmart and Liberty-NCX to simulate all needed conditions. Keep in mind that you will probably need timing models at "worst case", "nominal", and "best case" for most software to work properly.
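To make the idea of a timing table concrete, here is a minimal Python sketch of how a delay could be read out of a characterized table indexed by input slew and output load. The table values, axis points, and function names are hypothetical, and bilinear interpolation is an assumption for illustration; a real STA tool may use a different model or interpolation scheme.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return the indices (i, i + 1) of the axis segment used for x.

    Values outside the axis reuse the first or last segment, which
    amounts to linear extrapolation.
    """
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1


def lookup_delay(slews, loads, table, slew, load):
    """Bilinearly interpolate a delay from a slew x load table."""
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    # Interpolate along the load axis on both bracketing slew rows,
    # then interpolate between those two results along the slew axis.
    d_lo = table[i0][j0] + tl * (table[i0][j1] - table[i0][j0])
    d_hi = table[i1][j0] + tl * (table[i1][j1] - table[i1][j0])
    return d_lo + ts * (d_hi - d_lo)


# Hypothetical characterization data for one timing arc of one cell.
SLEWS_NS = [0.05, 0.20, 0.80]   # input transition times
LOADS_PF = [0.01, 0.05, 0.20]   # output load capacitances
DELAY_NS = [                    # rows follow SLEWS_NS, columns follow LOADS_PF
    [0.10, 0.20, 0.50],
    [0.15, 0.25, 0.55],
    [0.30, 0.40, 0.70],
]

if __name__ == "__main__":
    print(lookup_delay(SLEWS_NS, LOADS_PF, DELAY_NS, slew=0.125, load=0.03))
```

Each simulated corner ("worst case", "nominal", "best case") would have its own set of such tables.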
Synthesize your design. I don't have experience with high level compilers, but at the end of the day the compiler or compiler chain must take your high level design and generate a gate-level netlist. The synthesis result is the first peek you get at theoretical system performance, and where drive strength issues are first addressed. I have used Design Compiler for RTL code.
Place and Route your design. This takes the gate-level netlist from the synthesizer and turns it into a physical design. Ideally this generates a pad-to-pad layout that is ready for fabrication. It is really easy to set up your P&R software so that it automatically makes thousands of DRC errors, so it is not all fun and games in this step either. Most software will manage drive strength issues and generate clock trees as directed. Some software packages include Astro, IC Compiler, SoC Encounter, and Silicon Ensemble. The end result from place and route is the final netlist, the final layout, and the extracted layout parasitics.
Post-Layout Static Timing Analysis. The goal here is to verify that your design meets your timing specification, and doesn't have any setup, hold, or gating issues. If your design requirements are tight, you may end up spending a lot of time here fixing errors and updating the fixes in your P&R tool. The final STA tool we used was PrimeTime.
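As a rough illustration of what the STA tool checks on each register-to-register path, the sketch below computes setup and hold slack from the usual textbook relations. The class, field names, and numbers are hypothetical, and the model is simplified: a real tool also accounts for corner-specific delays, clock uncertainty, and derating.

```python
from dataclasses import dataclass


@dataclass
class PathTiming:
    """Delays for one launch-flop to capture-flop path, all in ns."""
    clock_period: float
    launch_clock_delay: float   # clock tree delay to the launching flop
    capture_clock_delay: float  # clock tree delay to the capturing flop
    clk_to_q: float
    data_delay_max: float       # slowest combinational path
    data_delay_min: float       # fastest combinational path
    setup_time: float
    hold_time: float


def setup_slack(p: PathTiming) -> float:
    """Data must arrive before the next capture edge minus the setup time."""
    arrival = p.launch_clock_delay + p.clk_to_q + p.data_delay_max
    required = p.clock_period + p.capture_clock_delay - p.setup_time
    return required - arrival


def hold_slack(p: PathTiming) -> float:
    """Data must stay stable until the hold time after the same capture edge."""
    arrival = p.launch_clock_delay + p.clk_to_q + p.data_delay_min
    required = p.capture_clock_delay + p.hold_time
    return arrival - required


# Hypothetical path: negative slack in either check would be a violation.
EXAMPLE = PathTiming(
    clock_period=2.0,
    launch_clock_delay=0.30,
    capture_clock_delay=0.35,
    clk_to_q=0.12,
    data_delay_max=1.60,
    data_delay_min=0.05,
    setup_time=0.08,
    hold_time=0.04,
)

if __name__ == "__main__":
    print("setup slack:", setup_slack(EXAMPLE))
    print("hold slack:", hold_slack(EXAMPLE))
```

Note that slowing the clock helps only the setup check; a hold violation has to be fixed in the design, typically by adding delay on the short path.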
Physical verification of the Layout. Once a layout has been generated by the P&R tool, you need to verify that the design meets the process design rules (Design Rule Check / DRC) and that the layout matches the schematic (Layout versus Schematic / LVS). These steps should be followed to ensure that the layout is wired correctly and is manufacturable. Again, some physical verification tools are Assura, Diva, or Calibre.
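To give a feel for what a single design rule looks like, here is a toy Python sketch of a minimum-spacing check between rectangles on one layer. The rectangle format, the Euclidean distance measure, the 0.09 um rule value, and the function names are all assumptions for illustration; real rule decks contain many more rule types and are run by the verification tool, not by hand-written scripts.

```python
from itertools import combinations
from math import hypot


def spacing(a, b):
    """Edge-to-edge distance between two axis-aligned rectangles.

    Each rectangle is (x0, y0, x1, y1). Overlapping or abutting
    rectangles have a spacing of 0.
    """
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)


def find_spacing_violations(rects, min_spacing):
    """Return index pairs of separate shapes that are closer than min_spacing.

    Shapes that touch or overlap are treated as one merged shape here,
    so they are not reported.
    """
    violations = []
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        d = spacing(a, b)
        if 0.0 < d < min_spacing:
            violations.append((i, j))
    return violations


# Hypothetical metal shapes, coordinates in um.
METAL1 = [
    (0.00, 0.00, 1.00, 1.00),
    (1.05, 0.00, 2.00, 1.00),  # only 0.05 um away from the first shape
    (3.00, 3.00, 4.00, 4.00),
]

if __name__ == "__main__":
    print(find_spacing_violations(METAL1, min_spacing=0.09))
```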
Simulation of the final design. Depending on complexity, you may be able to do a transistor-level simulation using Spectre or HSPICE, a "fast spice" simulation using HSIM, or a completely digital simulation using ModelSim or VCS. You should be able to generate a simulation with realistic delays with the help of your STA or P&R tool.
Starting with an existing gate library is a huge time saver, as is using any macros that benefit your design, such as memory, a microcontroller, or alternative processing blocks. Managing design complexity is a big part as well - a single clock design will be easier to verify than a circuit with multiple clock domains.
The simplest solution is to split the output and input of the driver modules.
module driver0 #(parameter N=8)(input [N-1:0] sig_in, output sig_out);
assign sig_out = 'b1; // drive with some real value
endmodule
module driver1 #(parameter N=8)(input [N-1:0] sig_in, output sig_out);
assign sig_out = 'b1; // drive with some real value
endmodule
You can allow the output to also be an input of the same module in top:
module top();
parameter N=8;
wire [N-1:0] sig;
driver0 #(N) u0(.sig_in(sig), .sig_out(sig[0]));
driver1 #(N) u1(.sig_in(sig), .sig_out(sig[1]));
endmodule
Or you can prevent the driver's output from feeding back by slicing the array for the input. This may take a bit more caution. You could also partition the input signals into two or more ports. Use whichever is more intuitive and easier to manage for the project.
module top();
parameter N=8;
wire [N-1:0] sig;
driver0 #(N-1) u0(.sig_in(sig[N-1:1]), .sig_out(sig[0]));
driver1 #(N-1) u1(.sig_in({sig[N-1:2],sig[0]}), .sig_out(sig[1]));
endmodule
Since IEEE Std 1364-2001, Verilog allows some fancy port definitions. See IEEE Std 1364-2001 § 12.3.3 Port declarations or SystemVerilog's IEEE Std 1800-2012 § 23.2.2 Port declarations.
Support for this may vary across tools, as this practice is not common. If your synthesizer has issues with interface, it may have issues with port aliasing as well.
module driver0 ( .sig({sig_in,sig_out}) );
input sig_in;
output sig_out;
assign sig_out = 'b1; // drive with some real value
endmodule
module driver1 ( .sig({sig_out,sig_in}) ); // sig_out maps to sig[1] so it does not collide with driver0 on sig[0]
input sig_in;
output sig_out;
assign sig_out = 'b1; // drive with some real value
endmodule
module top();
wire [1:0] sig;
driver0 u0(.sig(sig));
driver1 u1(.sig(sig));
endmodule
If the bus goes across several layers of hierarchy, then the above solution will likely be tedious to manage. An interface should be used instead. Some guidelines to reduce synthesis limitations:
- Define all signals as a logic type. The only exception is tri-states, where the bits are expected to have two or more drivers.
- Define all tri-states as wire or tri.
- Assign all logic types within an always_ff, always_comb, or always_latch (if a latch is necessary) block. Signals of type logic can also be assigned with assign statements, but not all tools validate driving conflicts with assign.
- Keep tri-state assignments as simple as possible.
  - Ex: assign myif.io = drive_enable ? io_out : 'z;
Do not use an interface as a port in the topmost module that will be synthesized. Many current synthesizers will flatten and localize interfaces with escaped names. If needed, create a wrapper module as a translation layer between the interface and the topmost port signals. Ex:
interface my_interface(input clk,
/* May need to declare top port level tri-states as an interface port */
inout [7:0] io, ...
/*Optional output logic ... , input ... */ );
logic in_a, in_b, out;
//wire [7:0] io; // internal tri-state
...
endinterface : my_interface
module same_port #(parameter N=1) (a,a); // Synthesizer might support this, see manual
inout [N-1:0] a;
endmodule : same_port
...
// RTL top with interface as port
module my_top(my_interface myif);
...
sub sub0 ( .myif(myif), .*);
...
endmodule : my_top
// wrapper to use for synthesis
module synth_top( output logic out, input in_a, in_b, clk, inout [7:0] io, ...);
my_interface myif( .* );
my_top mytop( .myif(myif) );
// connect myif sigs with port sigs that are not already connected with .*
always_comb begin : conn_in
myif.in_a = in_a;
myif.in_b = in_b;
...
end : conn_in
always_comb begin : conn_out
out = myif.out;
...
end : conn_out
/* connect tri-state this way may not be supported by all synthesizers,
* refer to manual */
//same_port #(8) conn_inout_io(io, myif.io);
...
endmodule : synth_top
Most toolchain vendors offer some form of high-level synthesis (HLS). How good it is depends heavily on how much you pay: the cheap tools will, well, be cheap, and the non-cheap ones cost a substantial amount of money every year. For most companies it is therefore more cost-efficient to have people translate the high-level design into RTL by hand than to pay for a tool.