Advanced Physical Design RTL2GDS using OpenLane/SKY130

This repository documents the knowledge gained and the steps followed during the Advanced Physical Design Using OpenLANE/SKY130 workshop. The workshop covers the complete ASIC flow from RTL to GDSII using open-source EDA tools (OpenLANE) and the open SKY130 PDK. The design taken through the flow is PICORV32A, a core based on the RISC-V architecture.


About RTL to GDSII Flow

The RTL (Register Transfer Level) to GDSII (Graphic Design System II) flow consists of the complete set of steps required to create a file that can be sent for tapeout. The RTL code is synthesized and optimized. After synthesis, floorplanning, power planning and PnR (place and route) are done while keeping the timing constraints in check. At the end, the GDSII file is written out. The complete flow consists of the following steps:

  • Writing RTL
  • Synthesis
  • STA (Static Timing Analysis)
  • DFT (Design for Testability)
  • Floorplanning
  • Placement
  • CTS (Clock Tree Synthesis)
  • Routing
  • GDSII Streaming

SKYWater130 PDK

It is an open-source PDK (Process Design Kit) released through a collaboration between Google and the SkyWater Technology foundry. The technology currently targets the 130 nm node. It is open to everyone and can be accessed at SkyWater Open Source PDK. This PDK is extremely flexible as it provides many optional features as standard features. Hence it gives designers a wide range of design choices.

OpenLANE

It is an open-source VLSI flow built from open-source tools. It is essentially a collection of scripts that invoke these tools in the right sequence, transform their inputs and outputs, and organise the results.

Tools Used

| Tool | Used for |
| --- | --- |
| Yosys | Synthesis of RTL Design |
| ABC | Mapping of Netlist |
| OpenSTA | Static Timing Analysis |
| OpenROAD | Floorplanning, Placement, CTS, Optimization, Routing |
| TritonRoute | Detailed Routing |
| Magic | VLSI Layout Tool |
| NGSPICE | SPICE Extraction and Simulation |
| SPEF_EXTRACTOR | Generation of SPEF file from DEF file |

AIM - The main objective of the ASIC Design flow is to take the design from RTL to GDSII format.

Day 1 - Inception of open-source EDA, OpenLANE and Sky130 PDK

How to talk to computers

IC Terminologies

Many terms come up over the course of the RTL2GDS physical design flow. Some of them are described below.

  • Package - ICs are supplied as packages. A package is the casing that contains the semiconductor device and protects it from damage. Packages come in various kinds. The example taken here is a QFN-48 (Quad Flat No-leads) package with 48 pins.

package

  • Chip - It sits in the centre of the package. The chip is connected to the package pins using wire bonds. Inside the chip there are various components such as pads, core, interconnects, etc.
  • Pads - These are the intermediate structures through which the internal signals from the core of the IC are connected to the external pins of the chip. The pads are organised as a pad frame. There are different kinds of pads for input, output, power supply and ground.
  • Core - It is the place inside the chip where all the logic units (gates, muxes, etc.) are present. These execute the instructions given to the chip and produce an output.
  • Die - It is the block of semiconducting material on which the functional circuit is built and which is sent for fabrication. It spans the entire size of the chip.

Introduction to RISC-V

RISC-V is an open instruction set architecture (ISA) rooted in reduced instruction set computer (RISC) principles. Being an open ISA, it can be freely used for processor design.

RISC-V Characteristics

  • It targets one clock cycle per instruction, in line with RISC principles (the actual cycle count depends on the implementation).
  • It follows the RISC principles.
  • It has both 32-bit and 64-bit variants. It also supports floating-point instructions.
  • It avoids micro-architecture or technology dependent features.
  • It shortens the time for a design to reach the market, as open-source IP can be reused.

Software to Hardware

The flow shows how the high level language (at software end) gets converted to machine language (at hardware end) and then gets executed on the package.

What happens when we run a program?

Suppose a C program needs to run on some hardware. First the C program is compiled into assembly language (here, a RISC-V assembly language program). The assembly program is then converted into a machine language program (basically 1s and 0s), which the hardware can understand.

How does an application run on a computer?

  1. The application software enters the system software (its major components are the OS, compiler and assembler).
  • The OS handles I/O operations, memory and many low-level functions.
  • The program then passes to the compiler, which translates it into assembly language (the instructions it is compiled into depend on the hardware).
  • The instructions then go to the assembler, which converts them into machine language (binary numbers).
  2. The system software thus converts the application software into binary language.
  3. These binary numbers enter our chip layout and the function is performed accordingly.

image

SoC design and OpenLane

Introduction to Digital design

Designing digital ASIC ICs requires the following components; some open-source resources for each are also mentioned.

  • RTL models (old IP's) {github.com, librecores.org, etc}
  • EDA tool {OpenROAD, OpenLANE, etc}
  • PDK Data {SKYWater 130}

In the workshop every component comes from an open-source resource. The following image gives an idea of each component as an open-source resource.

image

What is a PDK?

PDK stands for Process Design Kit. It is provided by foundries and consists of a library, or set of building blocks, used to build ICs. Each component in the library is a separate building block and is made following the foundry's rules.

PDKs act as an interface between the fabs and the designers. A PDK is a collection of files used to model a fabrication process for the EDA tools used to design an IC. It consists of technology node information, process design rules (for verification steps such as DRC, LVS and PEX), device models, I/O libraries, standard cell libraries, macro files, LEF files, etc.

Google, along with SkyWater, made the latter's PDK for the 130 nm node open source. The PDK supplies the data that the tools need for a successful implementation.

Environment Setup

The OpenLANE flow requires various open-source tools, as well as their supporting tools, to be installed for the complete physical design flow. Installing these tools one by one is tedious, and it is easy to get lost in the steps. Installation is easier with the scripts in the following repositories: VSDFlow (for installing Yosys, OpenSTA, Magic, OpenTimer, netgen, etc.) and OpenLANE Build Scripts.

Simplified RTL to GDSII Flow

gnome-shell-screenshot-6miv9a

The flow starts from the HDL code, i.e. the RTL model, and ends with the GDSII file. The major implementation steps are:

  • Synthesis - During synthesis the HDL design is translated into a circuit made up of components from the standard cell library. The resulting circuit is itself described in HDL and is referred to as the gate-level netlist, which is functionally equivalent to the RTL code. The library cells have regular layouts: each cell layout is a rectangle of fixed height, whereas the width is variable but discrete, i.e. an integer multiple of the unit site width.

image

  • Floor Planning - In floorplanning the chip area is planned, which in turn allows a robust power distribution network to be created to power the circuits. The die is partitioned into different building blocks or components, and the I/O pads are distributed. During macro floorplanning the macro dimensions, pin locations and row definitions (i.e. the rows and routing plan) are defined.

image

  • Power Planning - The power network is constructed. A chip is typically powered through multiple VDD and ground pins. The power pins are connected to all components through rings and multiple horizontal and vertical straps. Such a parallel structure is meant to reduce the resistance.

gnome-shell-screenshot-12eny2

  • Placement - The cells of the gate-level netlist are placed on the rows of the floorplan. To reduce the interconnect delay, connected cells are placed very close to each other; this also enables successful routing afterwards. Placement is done in two steps: global placement and detailed placement. Global placement tries to find optimal positions for the cells, which may or may not be legal, whereas the result of detailed placement is always legal.

image

  • Clock Tree Synthesis (CTS) - Clock routing is done before signal routing so that the clock is distributed to every sequential element. The clock distribution network delivers the clock to each sequential element with minimum skew and latency. It usually follows a regular shape, e.g. H-tree, X-tree, etc.

image

  • Routing - Signal routing is done using the metal layers. The router has to find a valid pattern of horizontal and vertical wires that implements the nets connecting the cells together. It uses the available metal layers as defined by the PDK. For each metal layer the PDK defines the thickness, width, pitch and vias. Vias are used to connect wires on two different metal layers. SkyWater 130 nm has 6 routing layers (a local interconnect layer and 5 metal layers).

image

  • Verification and Sign-off - After PnR and CTS we perform verification to check whether the layout is valid. This includes physical verification such as DRC and LVS. Design Rule Checking (DRC) ensures that the layout follows the design rules, and Layout vs Schematic (LVS) ensures that the final layout matches the synthesised gate-level netlist. Finally, Static Timing Analysis (STA) is done to make sure that the circuit meets all the timing constraints.

About OpenLANE

OpenLANE is a flow which uses various open-source tools for the RTL to GDSII flow. It was started for the striVe family of open-everything SoCs (open PDK, open EDA, open RTL). The tools it uses include Yosys, OpenROAD, Magic, Netgen, SPEF_EXTRACTOR, etc.

  • It has two modes of operation: autonomous and interactive.

  • It is tuned for SKYWater 130nm open PDK.

  • OpenLANE ASIC flow is shown below. image

  • The flow starts with RTL Synthesis. RTL is fed to Yosys with the design constraints. Yosys translates the RTL into a logic circuit using generic components.

  • The circuit can be optimized and then mapped to the standard cell library using the tool abc. abc scripts guide the optimization. OpenLANE has several abc scripts which implement different synthesis strategies (least area, least power consumption, etc.). The synthesis exploration utility is used for strategy exploration and report generation.

  • OpenLANE has a design exploration utility which can be used to sweep the design configurations (16 in total). It generates reports with different design metrics and also shows the number of violations in the layout. It is used for regression testing and to find the best configuration for our design.

  • OpenSTA performs the Static timing analysis on the netlist which is generated during synthesis.

  • After synthesis, the testing part (DFT) starts, i.e. scan insertion, Automatic Test Pattern Generation (ATPG), test pattern compaction, fault coverage and fault simulation. This step is optional.

  • The next step is physical implementation. This part is done with the OpenROAD application. It performs PnR, which consists of floorplanning and power planning (FP+PP), placement (global and detailed), optimization, CTS and routing (global and detailed). TritonRoute is used for detailed routing.

  • Logic equivalence checking (LEC) is performed because the optimization steps change the circuit compared to the one generated during synthesis. This is done using Yosys to make sure the functionality is equivalent.

  • During physical implementation there is a special step, fake antenna diode insertion, which is required to address antenna rule violations. The idea is to account for the antenna effect up front so that no antenna violations remain at a later stage. Hence a fake antenna diode is added next to every cell input after placement. Then the antenna checker from the Magic tool is run against the layout.

  • The fake antenna diode cell is created and added to the standard cell library.

When a long metal wire is fabricated it acts as an antenna: as a conductor it collects charge, which can damage the transistor gates connected to the wire during fabrication. So the length of wire connected to a transistor gate must be limited. This is done with the help of the router.

  • Sign-off includes STA, DRC and LVS. It also involves interconnect RC extraction from the routed layout followed by STA using OpenSTA.
  • Physical sign-off includes DRC and LVS. DRC and circuit (SPICE) extraction are performed using Magic, and the extracted circuit is compared against the netlist for LVS using Netgen.

Getting familiar with open-source EDA tools

Contents of the OpenLANE Directory

The following content is specific to the workshop. There are lot of other files present in the directory too.

  1. OpenLane folder - It contains all the tools and the files that need to be invoked during the flow.
  2. Designs - This folder contains all the designs required during the flow (picorv32a is the design used in this workshop).
  3. PDKs - This folder contains all the PDK-related files and information (open pdk, sky130A, skywater pdk).
  • open pdk contains the scripts.
  • sky130A pdk contains libs.ref (files specific to the process, such as timing and LEF, both tech and cell) and libs.tech (files specific to the tools).
  • skywater pdk contains the SkyWater 130 nm PDKs.

NOTE: Here the sky130_fd_sc_hd library is being used.

  4. config files - These override configuration values that have already been set, i.e. many of the switches otherwise use the default value already present in the OpenLane flow. The order of precedence of OpenLane settings, from highest to lowest, is:

  • sky130_xyz_config.tcl
  • config.tcl
  • Default value (already set in OpenLane)
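The precedence order can be pictured as layered dictionaries, where each later layer overrides the earlier one. The following is only a minimal Python sketch of that idea, not how OpenLane itself is implemented; the `resolve` helper is hypothetical and the values are illustrative.

```python
# Illustrative values only; the real defaults live in the OpenLane .tcl files.
DEFAULTS = {"FP_CORE_UTIL": 50, "FP_IO_VMETAL": 3, "FP_IO_HMETAL": 4}
DESIGN_CONFIG = {"FP_CORE_UTIL": 65}       # stands in for config.tcl
PDK_DESIGN_CONFIG = {"FP_CORE_UTIL": 35}   # stands in for sky130_xyz_config.tcl


def resolve(defaults, design_config, pdk_design_config):
    """Merge the three layers; later layers take precedence (hypothetical helper)."""
    merged = dict(defaults)            # lowest priority
    merged.update(design_config)       # overrides the defaults
    merged.update(pdk_design_config)   # highest priority
    return merged


settings = resolve(DEFAULTS, DESIGN_CONFIG, PDK_DESIGN_CONFIG)
print(settings)
```

A switch that appears in only one layer keeps that layer's value, while a switch set in several layers ends up with the value from the highest-priority one.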

LAB Day 1

Step 1: Starting OpenLane

  • Go to openlane folder.
cd work/tools/openlane_working_dir/openlane
  • Then run the docker command.
docker
  • Now run the flow.tcl file with interactive mode.
./flow.tcl -interactive
  • Now import packages
package require openlane 0.9

image

  • Now we are good to go to execute our commands.

NOTE - The above commands are to be run every time we use OpenLANE for the RTL2GDSII flow.

Step 2: Design Preparation

  • Knowing the contents of our design (picorv32a) folder.
  1. src
  2. sky130A_sky130_fd_sc_ms_config.tcl
  3. sky130A_sky130_fd_sc_ls_config.tcl
  4. sky130A_sky130_fd_sc_hs_config.tcl
  5. sky130A_sky130_fd_sc_hdll_config.tcl
  6. sky130A_sky130_fd_sc_hd_config.tcl
  7. config.tcl

Check the values in config.tcl by running the command below in the picorv32a folder (it sets a clock period of 5 units):

less config.tcl

image

  • Setting up the design. This step merges the cell LEF files and the technology LEF file, generating merged.lef, which is placed in the tmp folder.
prep -design picorv32a

image

This creates a new folder named runs inside picorv32a, containing a new folder named after the date on which the command was run. That folder holds the results, reports, command logs, PDK sources, etc.

image

Step 3: Running Synthesis

Entering the synthesis command runs Yosys synthesis. The abc scripts and OpenSTA are run along with it.

run_synthesis

After running synthesis, logs, reports and results are created.

The reports folder has the following files:

  1. 1-yosys_4.chk.rpt
  2. 1-yosys_4.stat.rpt
  3. 1-yosys_dff.stat
  4. 1-yosys_pre.stat
  5. 2-opensta.min_max.rpt
  6. 2-opensta.rpt
  7. 2-opensta.slew.rpt
  8. 2-opensta.timing.rpt
  9. 2-opensta_tns.rpt
  10. 2-opensta_wns.rpt

Also, a netlist file named picorv32a.synthesis.v is created in the results --> synthesis folder.

TASK 1: Finding the D flip-flop ratio

Count of D flip-flops (sky130_fd_sc_hd__dfxtp_2) = 1613

image

Number of cells = 14876

image

flop ratio = count of d flip flops / number of cells = 1613/14876 = 0.108429 (10.8429 %)
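The same arithmetic can be reproduced with a few lines of Python. This is just a sketch of the calculation using the two counts reported above; the function name `flop_ratio` is a hypothetical helper, not part of OpenLANE.

```python
def flop_ratio(flop_count, cell_count):
    """Ratio of D flip-flops to the total number of cells in the netlist."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return flop_count / cell_count


# Counts taken from the synthesis statistics report above.
ratio = flop_ratio(1613, 14876)
print(f"flop ratio = {ratio:.6f} ({ratio * 100:.4f} %)")
```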

The synthesis statistics report is as follows:

image

Day 2 - Good floorplan vs bad floorplan and introduction to library cells

Chip Floor planning

Here we try to come up with the width and height of the chip.

Utilization factor and aspect ratio

  1. Determining the width and height of the core and die - While defining the dimensions of the chip we mostly depend on the dimensions of the logic gates (standard cells) in the netlist.
  2. The core is the section where the fundamental logic is placed, whereas the die is a small specimen of semiconductor material on which the circuit is fabricated; the die contains the core.
  3. Once the logic is placed in the core it occupies a certain fraction of the core area, characterised by the utilization factor (area occupied by the netlist / total area of the core). A utilization factor of 1 means 100% utilization, so no extra cells could be added. In a practical scenario the core utilization factor is therefore always less than 1, and we generally go for 50-60% utilization (utilization factor = 0.5-0.6).
  4. Another important consideration is the aspect ratio (height / width of the core). An aspect ratio of 1 means the core is square.
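The two definitions above can be checked with a small Python sketch. The cell and core dimensions below are made-up example numbers (four cells of 1 x 1 unit each), and both function names are hypothetical.

```python
def utilization_factor(cell_areas, core_width, core_height):
    """Area occupied by the netlist divided by the total core area."""
    return sum(cell_areas) / (core_width * core_height)


def aspect_ratio(core_width, core_height):
    """Height of the core divided by its width."""
    return core_height / core_width


cells = [1.0, 1.0, 1.0, 1.0]  # four standard cells of 1 x 1 unit each (example values)

# A 2 x 2 core is completely filled by the four cells.
print(utilization_factor(cells, 2, 2), aspect_ratio(2, 2))

# A 4 x 2 core leaves half of the area free for buffers, routing, etc.
print(utilization_factor(cells, 4, 2), aspect_ratio(4, 2))
```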

Concept of pre-placed cells

Generally the utilization factor is less than 1, so some section of the core is unused. This unused section of the core is used for optimization and other purposes: additional cells are placed there, it is used for routing, etc.

Pre-placed cells - These are based on the concept of reusable modules or IPs. Such blocks need not be implemented every time they are used; they are functionally implemented only once, at some point in the past (e.g. memory, clock-gating cell, comparator, mux, etc.). The placement of these cells or IPs in the chip needs to be defined before placement and routing. They have fixed locations on the chip defined by the user, chosen depending on the design scenario. Since these cells are placed before placement and routing, they are called pre-placed cells. Automated placement and routing tools do not touch the positions of these cells.

image

De-coupling capacitors

The pre-placed cells need to be surrounded by de-coupling capacitors. Whenever a circuit switches, a certain amount of current is required, because there is a small capacitance present at each node. Switching from 0 to 1 means this capacitance has to charge up to represent logic 1, and the charge comes from the supply. So it is the responsibility of the supply voltage to deliver the required current to the switching logic. Likewise, during 1 to 0 switching it is the responsibility of VSS to take all the charge, hence there is a discharge current.

Because of the resistance and inductance of the wires, the voltage drops across the wire while current flows. The drop exists because the wires and vias have physical dimensions and therefore resistance, capacitance, inductance, etc. Due to the drop, the node sees VDD' instead of VDD, e.g. 0.7 V (VDD') instead of 1 V (VDD). VDD' must lie within the noise margin range. If it does, we are safe, but sometimes it does not.

To ensure safety we use de-coupling capacitors. These are large capacitors filled with charge, and the voltage across them equals the supply voltage. The required current is then provided by these capacitors, which de-couple the circuit from the main supply whenever there is switching. This takes care of local communication. Next we need to focus on global communication.
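As a rough sketch of the supply drop described above, the voltage seen at the switching node can be written as follows. The symbols $I$ (switching current), $R_{\text{wire}}$ and $L_{\text{wire}}$ (wire resistance and inductance) are assumptions of this note, not names used in the workshop.

```latex
V_{DD}' = V_{DD} - \left( I \, R_{\text{wire}} + L_{\text{wire}} \, \frac{dI}{dt} \right)
```

With the numbers used in the text, $V_{DD} = 1\,\text{V}$ and $V_{DD}' = 0.7\,\text{V}$ correspond to a total drop of $0.3\,\text{V}$ across the supply wiring.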

image image

Power planning

There are a lot of macros on a chip, and if each macro had its own de-coupling capacitors for its current demand, there would be far too many de-coupling capacitors, which is not feasible. Therefore only some critical blocks are de-coupled using de-coupling capacitors, not every element. There is always a possibility of a voltage drop at some node in the circuit. Suppose 16 capacitors connected in parallel are all going to switch from logic 1 to logic 0, i.e. the capacitors are discharging, and all of them are connected to the same ground. If all of them discharge at the same time there is a ground bounce (a bump on the ground line), and if the size of the bump exceeds the noise margin the logic might enter an undefined state. Similarly, when all the capacitors try to charge from 0 to 1 at the same time there is a voltage droop (all of them demand power at once), and the noise margin concept applies here too. This problem arises because the power supply is connected at only one point; if power is supplied around the entire periphery, the problem is resolved. Hence the solution is multiple power supply and ground connections, which is why we have multiple VDD and VSS ports.

image

Pin Placement and logical cell placement blockage

The connectivity information between the gates is coded using the VHDL/Verilog language and is called the netlist.

There are pre-placed cells already present in the core. The area between the core and the die is filled with the input and output ports. The placement of the I/O ports depends on the cells connected to these ports as well as on the pre-placed cells. The clock ports are bigger in size, as these are continuously driven pins and they drive the complete chip. The clocks need the least-resistance path, hence the clock pins are thicker.

image

After pin placement, the remaining empty area between the core and the die must be blocked. This is done with a logical cell placement blockage, which ensures that the automated placement and routing tools do not place anything in this area.

image

Voila!! we are done with Floor and Power Planning.

NOTE: Standard cell placement happens in placement stage.

Placement and routing

  • Netlist binding and initial place design
  1. In a schematic the shape of a gate indicates its functionality, but in reality each gate is a box. Hence we take each gate and give it physical dimensions. This is done for each component of the netlist. These cells are present in a library, which holds information such as shape and size, delays, various flavours of the cells and timing information. The library also provides options with different delays and sizes. A particular cell is chosen as per our requirement.

image

  2. The next step is to take these shapes and sizes and place them on the floorplan. The pre-placed cells are not disturbed. During placement the logical connectivity is maintained, and the placement is done in such a way that optimized paths are formed (i.e. blocks are placed close to their inputs and outputs).

  3. Till now we have kept each block close to the input or output port it connects to. Now, using some estimations, we try to optimize the placement. We can estimate the capacitance and resistance between two points. The wire length forms a resistance, which causes unnecessary voltage drop, and a capacitance, which causes a slew that might not be permissible for fast switching of the logic gates. Successfully transmitting a signal from one place to another without any loss is known as signal integrity. To maintain signal integrity we need repeaters (a kind of buffer). They are inserted according to the wire length and capacitance: based on the capacitance and resistance a waveform is generated, and the transition of the waveform should be in the permissible range. Repeaters cost area, though. Hence where integrity is already maintained we do not place repeaters, and where it is not we insert a buffer (repeater). The goal is a solution with the minimum number of repeaters. Sometimes we also do abutment, where logic cells are placed very close to each other (almost zero delay), if the logic has to run at a high frequency (e.g. 2 GHz). Crisscrossing of routes is normal in PnR and is resolved by using a separate metal layer (reached through vias) for the crisscrossed path.

    Based on the ideal condition of the clock (the time required by the clock to reach a component is 0) we do setup timing analysis, and based on this we check whether the placement meets the given specification.

    Placement in OpenLANE is done in two stages:

  • Global Placement - Its main job is to reduce the wire length. It is generally a coarse placement, and no legalization happens here. It uses the HPWL (Half Perimeter Wirelength) reduction model.

  • Detailed Placement - Legalization happens here: the standard cells are placed in standard cell rows. {Legalization - the cells should be exactly inside the rows, abutted to each other, with no overlap.}
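To make the HPWL metric concrete: for a single net it is half the perimeter of the bounding box around the net's pins, i.e. the box width plus the box height. The Python sketch below computes that textbook quantity for made-up pin coordinates; it is not the placer's actual code, and the function name `hpwl` is hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net, given its pins as (x, y) tuples."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    # Width plus height of the bounding box that encloses all pins.
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Example net with three pins (coordinates are arbitrary illustration values).
net = [(0, 0), (3, 4), (1, 2)]
print(hpwl(net))
```

A global placer that minimises the sum of this value over all nets tends to pull connected cells together, which is why the result is coarse and still needs legalization.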

The main aim of placement at this point is congestion, not timing. The next step will be CTS.

Placement before buffer insertion:

image

Placement with buffers

image

Our objective is to make the value of overflow converge (it is printed below the HPWL value during run_placement). If the value of overflow keeps decreasing, the design will converge. The generated .def file, in the placement folder under results, can now be viewed using the Magic tool.

NOTE: A collection of gates is called a library.

Cell design and characterization flows

In the IC design flow, a library is the place where we keep all our standard cells, buffers, decap cells, etc. The library not only has cells with different functionality, it also has the same cell in different sizes, threshold voltages, delays, etc.

  • Examining an inverter

Cell design flow is divided into three parts: inputs, design steps and outputs.

  1. Inputs - The inputs for designing an inverter are basically the PDK, which consists of DRC & LVS rules, SPICE models and libraries, plus the user-defined specs.
  • DRC & LVS Rules - These are the technology rules defined by the foundry: tech files and poly substrate parameters (covered in the Custom Layout course).

  • SPICE Models - These contain all the foundry-specific parameters, e.g. threshold voltage, and linear and saturation region equations with added foundry parameters, including NMOS and PMOS parameters (covered in the Circuit Design and SPICE Simulation course).

  • User-defined Specs - These are the specifications given by the user, which are to be achieved while following the DRC and LVS rules: cell height (separation between power and ground rails), cell width (depends on drive strength), supply voltage (provided by the top level; keep the noise margin in check), metal layer requirements (which metal layers the cell needs to work with), pin locations, drawn gate length, etc.

Now we have all the inputs (available with the library developers). The developer should take these inputs and come up with standard cells that adhere to these specs and rules.

  2. Design steps - There are three steps: circuit design, layout design and characterisation.
  • Circuit design includes the implementation of the logic function and the sizing (W/L ratios) of the NMOS and PMOS transistors.
  • Layout design is the physical implementation of the circuit description (i.e. the output of the circuit design).
 Steps in layout design:
 1. Get the function implemented using CMOS.
 2. Get the PMOS and NMOS network graphs out of the implemented circuit.
 3. Obtain the Euler path, i.e. a path that traverses each edge of the graph exactly once.
 4. Draw the stick diagram.
 5. Convert the stick diagram into a layout adhering to the DRC rules given by the foundry.
 
 The layout is generated using the Magic tool. Now we have the cell width and cell height, and all the cells adhere to the rules.
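Whether an Euler path exists can be checked from the node degrees alone: a connected graph has one exactly when it has zero or two nodes of odd degree. The Python sketch below applies this standard graph-theory test to a transistor network in which each edge is one transistor and each node is a source/drain net. The NAND2 networks and node names are illustrative assumptions, the function is a hypothetical helper, and the graph is assumed to be connected.

```python
from collections import defaultdict


def has_euler_path(edges):
    """Return True if a connected graph with these edges has an Euler path.

    edges: list of (node_a, node_b) pairs, one pair per transistor.
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    odd_nodes = sum(1 for d in degree.values() if d % 2 == 1)
    return odd_nodes in (0, 2)


# Illustrative 2-input NAND: series NMOS pull-down, parallel PMOS pull-up.
nmos_network = [("Y", "n1"), ("n1", "GND")]   # transistors A and B in series
pmos_network = [("VDD", "Y"), ("VDD", "Y")]   # transistors A and B in parallel

print(has_euler_path(nmos_network), has_euler_path(pmos_network))
```

This only tests existence for each network separately. For a compact layout, the PMOS and NMOS networks also need Euler paths that visit the gate inputs in the same order, which this sketch does not check.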

We can then extract the parasitics from the layout and characterise it with respect to timing.

  3. Outputs - The outputs of layout design are the GDSII file, the LEF file, which defines the width and height of the cells, and the extracted SPICE netlist.

Now we will do characterisation and generate the timing, noise and power .libs along with the function. Here we will try to understand the syntax and semantics of timing.lib, power.lib and noise.lib. These are important for understanding the GUNA software, i.e. the characterisation software, because the software works on these variables, and these are the variables we have to feed into it.

  • Timing characterisation

image

Here we first understand the different threshold points of a waveform, called the timing threshold definitions.

Consider the two-inverter figure above and the graph below. Red curve - input to the second inverter; blue curve - output of the second inverter. The slew definitions are shown in the figure below for both the rising and the falling edge. With the help of the timing threshold definitions we can calculate the slew as well as the propagation delay.

image

Similarly, we have thresholds for the delays (rise and fall) as we had for slew, and we analyse the waveform for the delays.

image

Getting a negative propagation delay is highly unexpected. A negative propagation delay would mean that the output arrives before the input. To avoid this, we as designers need to choose threshold points that lead to positive delays. The propagation delay threshold is usually 50% and the slew thresholds are usually 20-80%.
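The threshold definitions can be turned into a short calculation. The Python sketch below finds threshold crossings on piecewise-linear waveforms by linear interpolation and derives the propagation delay (50% to 50%) and the fall slew (80% to 20%). The waveforms are idealised ramps invented for illustration (times in ps, VDD = 1 V), and the helper name is hypothetical; this is not how the characterisation software works internally.

```python
def crossing_time(times, volts, level):
    """First time at which a piecewise-linear waveform crosses `level`."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = volts[i], volts[i + 1]
        # A segment crosses the level if its end points lie on opposite sides.
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


VDD = 1.0  # illustrative supply voltage

# Idealised inverter waveforms (made-up numbers): input rises, output falls.
in_t, in_v = [0, 100], [0.0, 1.0]
out_t, out_v = [0, 40, 140], [1.0, 1.0, 0.0]

# Propagation delay: output 50% crossing minus input 50% crossing.
delay = crossing_time(out_t, out_v, 0.5 * VDD) - crossing_time(in_t, in_v, 0.5 * VDD)

# Fall slew of the output: time between the 80% and 20% crossings.
fall_slew = crossing_time(out_t, out_v, 0.2 * VDD) - crossing_time(out_t, out_v, 0.8 * VDD)

print(f"propagation delay = {delay:.1f} ps, fall slew = {fall_slew:.1f} ps")
```

Moving the delay thresholds away from 50% changes the two crossing times independently, which is how a poor choice of thresholds can produce a negative delay.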

LAB Day 2

Continuation after synthesis.

Step 1: Running floorplan

  • There are a lot of switches with which we can adjust the flow. These switches set certain parameters in each stage of the flow. E.g., in the floorplanning stage we have FP_CORE_UTIL {utilization percentage}, FP_ASPECT_RATIO {aspect ratio}, FP_CORE_MARGINS {offset between die boundary and core boundary}, etc. OpenLane has a set of .tcl files which contain these switches and set these specifications.
├── README.md
├── checkers.tcl
├── cts.tcl
├── floorplan.tcl
├── general.tcl
├── lvs.tcl
├── placement.tcl
├── routing.tcl
└── synthesis.tcl

floorplan.tcl contains the following default switches:

image

  • The command to run floorplan is:

run_floorplan

Step 2: Review floorplan files and steps to view floorplan

  • Reviewing files

Here the created files are checked using the log file log/floorplan/4-ioPlacer.log. If it is not there, the result can be checked using the Magic tool.

  • In floorplan.tcl (the default), the core utilization is 50%.

image

  • In the config.tcl file under the runs folder, the core utilization is 35%.

image

  • Set the core utilization and the vertical and horizontal metal layers by adding these three switches to the config.tcl file:
set ::env(FP_CORE_UTIL) 65
set ::env(FP_IO_VMETAL) 4
set ::env(FP_IO_HMETAL) 3
  • Viewing Floorplan

The DEF (Design Exchange Format) file is created in the floorplan folder of the results folder under the runs folder. This file has information about the die area. It gives the coordinates of the die in database units, with 1000 database units per micron (i.e. 1 micron = 1000 database units).

image

The die coordinates and other information can be viewed by opening the DEF file with the following command from the picorv32a folder:

less runs/[date]/results/floorplan/picorv32a.floorplan.def

NOTE: 1 micron is equivalent to 1000 database units

TASK 2: Calculating area

Calculating the die area = (660685 / 1000) x (671405 / 1000) = 443587.2124 µm²
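The calculation can also be scripted by reading the UNITS and DIEAREA statements from the DEF text. The Python sketch below uses a hand-written DEF snippet rather than the real file: the upper-right coordinates and the 1000 database units per micron come from the text above, while the lower-left corner at (0, 0) is an assumption. The function name `die_area_um2` is hypothetical.

```python
import re

# Hand-written snippet for illustration; the (0 0) lower-left corner is assumed.
DEF_SNIPPET = """
UNITS DISTANCE MICRONS 1000 ;
DIEAREA ( 0 0 ) ( 660685 671405 ) ;
"""


def die_area_um2(def_text):
    """Die area in square microns, from the UNITS and DIEAREA statements."""
    units = int(re.search(r"UNITS\s+DISTANCE\s+MICRONS\s+(\d+)", def_text).group(1))
    corners = re.search(
        r"DIEAREA\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)",
        def_text,
    )
    x0, y0, x1, y1 = map(int, corners.groups())
    # Convert database units to microns before multiplying.
    return ((x1 - x0) / units) * ((y1 - y0) / units)


area = die_area_um2(DEF_SNIPPET)
print(f"die area = {area:.4f} um^2")
```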

Step 3: Review floorplan layout in Magic

  • Using Magic tool to view the def file

The following command can be used to invoke magic tool as well as open the def file:

magic -T /home/nickson/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.floorplan.def &

To center the view, press "s" to select the whole die, then press "v" to center the view. Point the cursor at a cell and press "s" to select it, then zoom into it by pressing "z". Type "what" in tkcon to display information about the selected object. These objects might be IO pins, decap cells, or well taps, as shown below.

The generated layout is shown below:

image

The horizontal and vertical pins:

image

The decap cells:

gnome-shell-screenshot-45vknb

The diagonally equidistant tapcells:

image

The standard cells in the bottom left corner:

image

Step 4: Running Placement

The following command is used to run placement.

run_placement

During placement several tools are invoked: RePlAce (for global placement), Resizer (for optimization) and OpenDP (for detailed placement). If the overflow value converges (keeps decreasing), the placement is considered successful.

  • Using Magic tool to see the layout of this stage.
magic -T /home/nickson/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.placement.def &

The generated layout:

image

Placement ensures that the standard cells are placed correctly. In a typical flow the PDN is created during floorplanning, but in OpenLane the PDN is generated after floorplan, placement and CTS.

Day 3 - Design library cell using Magic Layout and ngspice characterization

LAB Day 3 (Part 1)

Labs for CMOS inverter ngspice simulations

Here we dive deeper into the flow. We will take a .mag file and do post-layout simulation in ngspice. After characterising the cell we will plug it into the OpenLane flow, i.e. into the picorv32a core.

  • IO Placer revision: Earlier the input/output pins were placed equidistantly. Suppose we want a different pin placement strategy (IO Placer, the tool used for IO placement, supports four strategies). The strategy is selected by a switch (variable), which can be set directly in the terminal before re-running the floorplan.
EXAMPLE - changing PIN configuration
set ::env(FP_IO_MODE) 2;
run_floorplan

Then check the layout by launching Magic again.
  • SPICE deck creation for CMOS inverter

Here we run a SPICE simulation and derive the characteristics using real MOSFET models.

Creating SPICE deck

  1. SPICE deck - It holds the connectivity information of a cell, i.e. it is a netlist. It has the inputs, tap points, etc.

  2. We need to define the component values, i.e. the W/L of the PMOS and NMOS. Here both PMOS M1 and NMOS M2 use W/L = 0.375u/0.25u. Ideally the PMOS should be 2 to 3 times wider than the NMOS. The load capacitance is assumed to be 10 fF.

  3. We assume an input voltage (at the gate) of 2.5 V and a main supply voltage (VDD) of 2.5 V. Generally the supply voltage is a multiple of the channel length.

  4. Now we need to identify the nodes (the points between which a component is connected) and name them.

The SPICE Deck is written below:

*** MODEL Description ***
*** NETLIST Description ***
* Format: [component name] [drain] [gate] [source] [substrate] [type] [dimensions W/L]
M1 out in vdd vdd pmos W=0.375u L=0.25u
* Similarly for NMOS (source and substrate tied to ground, node 0)
M2 out in 0 0 nmos W=0.375u L=0.25u
* Load cap connectivity and value: [name] [node1] [node2] [value]
cload out 0 10f
* Supply voltage: [name] [node1] [node2] [value]
Vdd vdd 0 2.5
* Input voltage: [name] [node1] [node2] [value]
Vin in 0 2.5
*** Simulation Command ***
.op
* Sweep the gate input from 0 to 2.5 V in steps of 0.05 V to get the VTC curve
.dc Vin 0 2.5 0.05
*** describe the model file ***
.LIB "tsmc_025ummodel.mod" CMOS_MODELS
.end

image

First invoke the ngspice and then run the following command to simulate:

source [filename].cir
run
setplot 
dc1 
plot out vs in 

Analysing the inverter

  • Vm (switching threshold voltage) - The point where Vin = Vout. At this point both MOSFETs are in saturation and a large current flows directly from VDD to ground. If the pull-up network is stronger, the VTC shifts to the right (Vm' > Vm); if the pull-down network is stronger, the VTC shifts to the left (Vm' < Vm).

Formula for Vm

image

  • Propagation delay - The time difference between the 50% points of the input and the output (output falling while the input rises gives the fall delay; output rising while the input falls gives the rise delay).

  • We can further do transient analysis.

LAB SETUP

  • We first git clone a repo that is custom made for the workshop. It contains the .mag file of the inverter and the model files for the sky130 PMOS and NMOS. We will create a full cell view and then plug it into our flow.

The command for git clone is (run it while you are in the openlane directory):

git clone https://github.com/nickson-jose/vsdstdcelldesign.git

image

It will create the vsdstdcelldesign folder.

We need the tech file to open the .mag file, so we copy it into our directory. The tech file is in the sky130A folder inside pdks, at work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic.

cd work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/

To copy it, go to that location and run the command below with the target location:

cp sky130A.tech /[target location]

target location for our case - /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/vsdstdcelldesign

Now invoke magic tool in the vsdstdcelldesign folder to see the mag file i.e., layout of the inverter.

Command

magic -T [tech file] [.mag file]
tech file = sky130A.tech .mag file = sky130_inv.mag

The generated layout:

image

Inception of Layout  CMOS fabrication process (16 mask process)

  1. Selecting a substrate
  2. Creating active region for transistor
  3. N-Well and P-Well Fabrication
  4. Formation of Gate
  5. Lightly Doped Drain formation
  6. Source and Drain Formation
  7. Form Contacts and Interconnects
  8. Higher Level Metal Formation

LAB DAY 3 (PART 2)

Step 1 : Characterisation We try to analyse the layout part by part using the what command in tkcon window.

image

lef (library exchange format) - it has the information about the metal layers. It also protects the IP.

def (design exchange format)

To implement the complete CMOS inverter click here.

Magic is an interactive DRC tool: DRC violations are shown automatically. Our design has no DRC violations. We need to ensure that the final design is DRC clean.

To check the logical function of the inverter we first extract the SPICE netlist and then simulate it using ngspice.

To extract the SPICE netlist, type these commands in the tkcon window:

  • Create an .ext file with the command extract all (it is written to the vsdstdcelldesign folder).
  • We use this .ext file to build the SPICE file that can be used with ngspice. The commands below extract all the parasitics too.

image

ext2spice cthresh 0 rthresh 0 
ext2spice
  • Seeing what is inside the SPICE file:
vim sky130_inv.spice

image

So we now have our standard cell and its extracted SPICE model.

LAB DAY 3 (PART 3)

The above SPICE model gives the connectivity information of our inverter. For transient analysis we have to define the connections:

  • We want VGND to be connected to VSS.
  • We want the supply voltage (VDD) to be connected between VPWR and VSS (ground).
  • So we create a node 0 and set VDD = 3.3 V.
  • Then we apply a pulse voltage between A and VGND (VSS).

We need to ensure that the scaling is correct (set to the grid value specified in the layout). The dimension of a grid in the layout can be checked with the box command.

Add these commands to our SPICE deck:

.option scale=0.01u //set scale to 0.01 u.
.include ./libs/pshort.lib
.include ./libs/nshort.lib
// comment out the .subckt line
// change name of model from pmos (sky130_fd_pr__pfet_01v8) to pshort_model and from nmos(sky130_fd_pr__nfet_01v8) to nshort_model
VDD VPWR 0 3.3V //supply
VSS VGND 0 0V //ground
Va A VGND PULSE (0V 3.3V 0ns 0.1ns 0.1ns 2ns 4ns)
// comment .end
//transient 
.tran 1n 20n
.control
run
.endc
.end

error image

It gave a subckt-related error, so I referred to the exact SPICE file from the following link here.

* SPICE3 file created from sky130_inv.ext - technology: sky130A

.option scale=0.01u
.include ./libs/pshort.lib
.include ./libs/nshort.lib

* .subckt sky130_inv A Y VPWR VGND
M0 Y A VGND VGND nshort_model.0 ad=1435 pd=152 as=1365 ps=148 w=35 l=23
M1 Y A VPWR VPWR pshort_model.0 ad=1443 pd=152 as=1517 ps=156 w=37 l=23
C0 A VPWR 0.08fF
C1 Y VPWR 0.08fF
C2 A Y 0.02fF
C3 Y VGND 2fF
C4 VPWR VGND 0.74fF
* .ends

* Power supply 
VDD VPWR 0 3.3V 
VSS VGND 0 0V 

* Input Signal
Va A VGND PULSE(0V 3.3V 0 0.1ns 0.1ns 2ns 4ns)

* Simulation Control
.tran 1n 20n
.control
run
.endc
.end

After editing, we launch ngspice to see the values:

command

ngspice [spice file] // our case sky130_inv.spice

image

We can now see the plots (inside ngspice, type the command below):

plot y vs time a

The transient plot is shown below:

image

Characterisation involves four parameters:

  1. rise transition - time taken by the output waveform to rise from 20% to 80% of VDD.

20% value (0.66) = 2.1829 ns

image

80% value (2.64) = 2.24407 ns

image

Hence rise time = 2.24407 - 2.1829 = 0.06117 ns

TASK 3: calculating delays and fall time

  2. fall transition - time taken by the output waveform to fall from 80% (2.64) to 20% (0.66) of VDD.

image

fall time = 0.02725 ns

3 & 4. Propagation delay - The time difference between the 50% points (1.65) of the input and the output (output falling while the input rises gives the fall delay; output rising while the input falls gives the rise delay).

  • fall delay:

output falling (50%)

image

input rising (50%)

image

Therefore fall delay = 8.07761 - 8.05075 = 0.02686 ns

  • rise delay:

output rising (50%)

image

input falling (50%)

image

Therefore rise delay = 6.15075 - 6.15 = 0.00075 ns

The above characterisation is done at 27 °C.
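The arithmetic behind these parameters can be collected in one place. This is a minimal sketch that only reuses the supply voltage and the time stamps read from the ngspice plots above; the variable names are illustrative.

```python
VDD = 3.3  # supply voltage used in the SPICE deck (V)

# Voltage levels at which the time stamps are read
v20 = 0.2 * VDD  # 0.66 V
v50 = 0.5 * VDD  # 1.65 V
v80 = 0.8 * VDD  # 2.64 V

# Time stamps in ns, taken from the plots above
rise_transition_ns = 2.24407 - 2.1829   # output 80% point minus output 20% point
fall_delay_ns = 8.07761 - 8.05075       # output falling 50% minus input rising 50%
rise_delay_ns = 6.15075 - 6.15          # output rising 50% minus input falling 50%

print(f"levels: {v20:.2f} V, {v50:.2f} V, {v80:.2f} V")
print(f"rise transition = {rise_transition_ns:.5f} ns")
print(f"fall delay      = {fall_delay_ns:.5f} ns")
print(f"rise delay      = {rise_delay_ns:.5f} ns")
```

The fall transition is computed the same way from the 80% and 20% time stamps of the falling output edge.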

The next objective is to use this inverter layout to create a LEF file. Using this LEF in OpenLane, we will plug the custom cell into picorv32a.

Step 2: DRC rules analysis

To know more about Magic and its DRC commands, visit the following link. The technology file holds all the technology-related information: layers, patterns, electrical connectivity, GDS generation rules, DRC rules, and so on. Information about the technology files can be found here.

NOTE: cif (Caltech Intermediate Format) is used interchangeably with gds in the Magic tech file and documentation. Read through the website for the DRC rules. The basic DRC rules are called edge-based rules.

  1. We download the required DRC test files using the command:
wget http://opencircuitdesign.com/open_pdks/archive/drc_test.tgz

Upon extraction we find that there are .mag files and sky130A.tech file.

image

Now we can use magic to analyse the DRC rule and fix it if it's violated.

DAY 4 Pre-layout timing analysis and importance of good clock tree

LAB DAY 4 (PART 1)

Pre-layout timing analysis and importance of good clock tree

OpenLANE is a place and route flow, and to place a cell we do not need the entire .mag file. The .mag file has all the information: power, ground, logic, metal, etc. For PnR the only information needed is the PR boundary, the power rail, the ground rail, and the input and output ports. This is where LEF files come into the picture: a LEF file has only this information, and so it also protects our IP.

  • Now our objective is to extract the LEF file from the .mag file and then plug it into picorv32a (that is, instead of a library standard cell we use our own design).

  • Guidelines for making a standard cell set from the PnR point of view:

  1. The input and output ports must lie on the intersections of the vertical and horizontal tracks.
  2. The width of the standard cell must be an odd multiple of the horizontal track pitch and the height must be an odd multiple of the vertical track pitch.

For the first requirement, go to the directory pdks/sky130A/libs.tech/openlane/sky130_fd_sc_hd/ and run less tracks.info.

Tracks are used during routing. Routes are metal traces, and they can only run along the tracks defined for each layer. PnR is an automated process, so we need to specify where the routes may go; this information is given by the tracks. Hence tracks are guides for the router. The horizontal and vertical track pitches are listed for each layer.

We now verify the guideline using Magic. Pressing g makes the grid visible. We align the grid with the track values so that we can verify whether our ports are on the intersections of the horizontal and vertical li1 tracks. So we take the track file as the reference and set the grid from the tkcon window.

From the track file we get the x pitch, y pitch, vertical offset and horizontal offset. Let's make a grid according to the track information.

command inside vsdstdcelldesign

magic -T sky130A.tech sky130_inv.mag &

Then open the track information in pdks -> sky130A -> libs.tech -> openlane -> sky130_fd_sc_hd/:

less tracks.info

image

Horizontal track pitch = 0.46, vertical track pitch = 0.34, horizontal offset = 0.23, vertical offset = 0.17

image

We can observe that the ports lie on the horizontal and vertical track crossings, and that the grid spacing has changed.

image

Second requirement

The width of the std cell (x direction) should be an odd multiple of the x pitch and the height of the std cell (y direction) should be an odd multiple of the y pitch. We find that the cell meets these conditions on the grid.
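Both guidelines can also be checked numerically instead of by eye. The sketch below is illustrative only: it uses the li1 pitch and offset values quoted above (converted to integer nanometres to avoid floating-point error), while the port position and cell width are hypothetical example numbers, not values measured from sky130_inv.mag.

```python
# li1 track values from tracks.info, in nanometres
X_OFFSET, X_PITCH = 230, 460   # horizontal offset and pitch
Y_OFFSET, Y_PITCH = 170, 340   # vertical offset and pitch

def on_track_intersection(x_nm, y_nm):
    """True if the point lies on a crossing of a vertical and a horizontal track."""
    return (x_nm - X_OFFSET) % X_PITCH == 0 and (y_nm - Y_OFFSET) % Y_PITCH == 0

def pitch_multiple(length_nm, pitch_nm):
    """Return how many pitches fit exactly in length_nm, or None if it is not a whole multiple."""
    quotient, remainder = divmod(length_nm, pitch_nm)
    return quotient if remainder == 0 else None

# Hypothetical port centre and cell width, for illustration only
port = (690, 1190)
cell_width_nm = 1380

print(on_track_intersection(*port))
n = pitch_multiple(cell_width_nm, X_PITCH)
print(n, "pitches, odd" if n is not None and n % 2 == 1 else "not an odd multiple")
```

The real port coordinates can be read in Magic by selecting the port and using the box command in tkcon.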

Step 3 LEF file extraction

Ports do not mean anything to Magic by themselves. Port definitions are required when we want to extract the LEF file; after extraction the ports are converted into pins. The LEF file contains the cell size, port definitions, and properties which aid the placer and router tool. For that, the port definition, port class, and port use must be set first. The instructions to set these definitions via Magic are on the vsdstdcelldesign repo.

Once the ports are defined, the next step is to define their purpose. For that we set the port class and port use. Refer to the vsdstdcelldesign repo.

After setting the parameters we are ready to extract the LEF file from our .mag file. We give the cell a custom name with save sky130_vsdinv.mag (command in the tkcon window).

image

Then open our new inverter mag.

magic -T sky130A.tech sky130_vsdinv.mag

Command to create the lef file (run in tkcon window)

lef write [name optional]

The lef file contains all of this information.

image

Setting a layer as a port creates a PIN in the macro. Now our lef file is ready. Next we plug it into our picorv32a design.

image

Step 4 Introduction to timing libs and steps to include new cell in synthesis

  • First copy the newly created lef to src under picorv32a:
cp sky130_vsdinv.lef /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src

We need a library that has our cell definition for synthesis so that abc can map it (inside vsdstdcelldesign -> libs). There are different library files for different PVT corners and speeds.

  • We require the fast, slow and typical libraries for STA.

  • Now our objective is that the tool should map the vsd cell during the synthesis flow. We will copy the library (from vsdstdcelldesign -> libs) files to src folder under picorv32a.

cp sky130_fd_sc_hd__*  /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src
  • Then we need to modify config.tcl under picorv32a. The edited config.tcl is shown below:
# Design
set ::env(DESIGN_NAME) "picorv32a"

set ::env(VERILOG_FILES) "./designs/picorv32a/src/picorv32a.v"
set ::env(SDC_FILE) "./designs/picorv32a/src/picorv32a.sdc"

set ::env(CLOCK_PERIOD) "5.000"
set ::env(CLOCK_PORT) "clk"
set ::env(CLOCK_NET) $::env(CLOCK_PORT)

set ::env(LIB_SYNTH) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__typical.lib"
set ::env(LIB_MIN) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__fast.lib"
set ::env(LIB_MAX) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__slow.lib"
set ::env(LIB_TYPICAL) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__typical.lib"

set ::env(EXTRA_LEFS) [glob $::env(OPENLANE_ROOT)/designs/$::env(DESIGN_NAME)/src/*.lef]

set ::env(FP_CORE_UTIL) 65
set ::env(FP_IO_VMETAL) 4
set ::env(FP_IO_HMETAL) 3


set filename $::env(OPENLANE_ROOT)/designs/$::env(DESIGN_NAME)/$::env(PDK)_$::env(STD_CELL_LIBRARY)_config.tcl
if { [file exists $filename] == 1} {
source $filename
}
  • After editing the config file run the full flow from start i.e., in terminal opened on desktop. The set of commands are given below:
1. cd work/tools/openlane_working_dir/openlane
2. docker
3.  ./flow.tcl -interactive
4. package require openlane 0.9
5. prep -design picorv32a -tag [file_name (26-01_21-38)] -overwrite

error image

To resolve - Just change LIB_MIN to LIB_FASTEST and LIB_MAX to LIB_SLOWEST.

Hence the corrected config.tcl file:

# Design
set ::env(DESIGN_NAME) "picorv32a"

set ::env(VERILOG_FILES) "./designs/picorv32a/src/picorv32a.v"
set ::env(SDC_FILE) "./designs/picorv32a/src/picorv32a.sdc"

set ::env(CLOCK_PERIOD) "5.000"
set ::env(CLOCK_PORT) "clk"
set ::env(CLOCK_NET) $::env(CLOCK_PORT)

set ::env(LIB_SYNTH) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__typical.lib"
set ::env(LIB_FASTEST) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__fast.lib"
set ::env(LIB_SLOWEST) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__slow.lib"
set ::env(LIB_TYPICAL) "$::env(OPENLANE_ROOT)/designs/picorv32a/src/sky130_fd_sc_hd__typical.lib"

set ::env(EXTRA_LEFS) [glob $::env(OPENLANE_ROOT)/designs/$::env(DESIGN_NAME)/src/*.lef]

set ::env(FP_CORE_UTIL) 65
set ::env(FP_IO_VMETAL) 4
set ::env(FP_IO_HMETAL) 3


set filename $::env(OPENLANE_ROOT)/designs/$::env(DESIGN_NAME)/$::env(PDK)_$::env(STD_CELL_LIBRARY)_config.tcl
if { [file exists $filename] == 1} {
source $filename
}
  • Then run the above set of commands again in a terminal opened on the desktop.

image

Now run the below command to add additional lefs.

set lefs [glob $::env(DESIGN_DIR)/src/*.lef]
add_lefs -src $lefs

image

  • Then run synthesis
run_synthesis

After this we can find that our inverter is being used here, see the image attached below:

image

image

Delay Table:

Problem:

  1. The capacitance or the load at the output node of each and every buffer in the complete clock tree is varying.

  2. Also if the load is varying the input transition is varying.

To avoid large skew between the endpoints of a clock tree (caused by the signal arriving at different points in time), the following must hold after the buffers are split into levels:

  • Buffers on the same level must have the same capacitive load, to ensure the same delay (latency) on that level.

  • Buffers on the same level must also be the same size (different buffer sizes -> different W/L ratio -> different resistance -> different RC constant -> different delay).

image

Solution:

Delay tables are the solution. A delay table is a 2D table in which the delay of a component is characterised for different input slews and output loads.

The timing model of each cell is summarised in delay tables, which are part of the liberty file. The delay depends mainly on the output slew, which in turn depends on the capacitive load and the input slew. The input slew has its own transition table and is a function of the previous buffer's output load and input slew.
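How a delay is read from such a table for a slew and load that fall between the characterised points can be sketched with bilinear interpolation. This is an illustration under assumptions: the function name and all table values below are made up for the example and are not taken from any sky130 liberty file, and real STA tools may interpolate differently.

```python
from bisect import bisect_right

def interpolate_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a 2D delay table (rows: input slew, columns: output load)."""
    def bracket(axis, value):
        # Index i so that axis[i] and axis[i + 1] are the points used for interpolation;
        # values outside the axis range are extrapolated from the nearest segment.
        i = bisect_right(axis, value) - 1
        return max(0, min(i, len(axis) - 2))

    i = bracket(slews, slew)
    j = bracket(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low_slew = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high_slew = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low_slew * (1 - ts) + high_slew * ts

# Hypothetical table: input slew in ps, output load in fF, delay in ps
slews = [20, 40, 60]
loads = [10, 30, 50]
table = [
    [10, 20, 30],
    [14, 24, 34],
    [18, 28, 38],
]

print(interpolate_delay(slews, loads, table, slew=30, load=20))
```

Two buffers of the same size driving the same load with the same input slew land on the same table entry, which is why the conditions above keep the delay equal on each level.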

image

LAB DAY 4 (PART 2)

Step 1 Lab steps to configure synthesis settings to fix slack and include vsdinv

Let's try to fix the slack. Currently the value of slack is

tns (total negative slack) = -711.59
wns (worst negative slack) = -23.89

Slack should always be positive; negative slack indicates a timing violation. We try to maintain a balance between delay and area by changing variables such as SYNTH_STRATEGY (to change the strategy), SYNTH_BUFFERING (adds buffers to reduce wire delays) and SYNTH_SIZING.
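As a reminder of what the two numbers mean, the sketch below computes wns and tns from a list of endpoint slacks. The slack values are hypothetical and chosen only to illustrate the definitions; they are not taken from the picorv32a run.

```python
def wns_tns(slacks):
    """Return (worst negative slack, total negative slack) for a list of endpoint slacks."""
    negatives = [s for s in slacks if s < 0]
    wns = min(negatives) if negatives else 0.0  # most negative slack, 0 if timing is met
    tns = sum(negatives)                        # sum of all negative slacks
    return wns, tns

# Hypothetical endpoint slacks in ns
endpoint_slacks = [0.5, -1.25, -0.75, 2.0]
wns, tns = wns_tns(endpoint_slacks)
print(f"wns = {wns}, tns = {tns}")
```

A design meets setup timing when both values are zero, i.e. no endpoint has negative slack.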

We can use the following command to check the value of any variable (switch):

echo $::env([Variable]) // our case = SYNTH_STRATEGY
// change the STRATEGY, Similarly change for buffering and sizing.

NOTE: We need to delete the old synthesis (.v) file so that the changed attributes/variables/switches take effect on the slack.

set ::env(SYNTH_STRATEGY) "DELAY 0"
set ::env(SYNTH_SIZING) 1

Earlier

image

After changing

image

The slack we get after detailed placement.

image

Next run floorplan, placement and CTS by executing the following commands one by one:

init_floorplan
place_io
global_placement_or
detailed_placement
tap_decap_or
detailed_placement
gen_pdn
run_cts

image

image

image

Then check the file which is created. Go to the placement folder under results, invoke the Magic tool and load the def file. The command is:

magic -T /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.placement.def

We can see our sky130_vsdinv file in the merged.lef file inside the tmp folder. The macro is present.

image

We can also see sky130_vsdinv inside the layout:

image

Timing Analysis

First we take an ideal clock (the clock tree is not yet built) and do the timing analysis with it. After that we repeat it with the real clock.

Pre-layout timing analysis (using ideal clock)

  • SETUP TIMING ANALYSIS. Specifications: clock frequency = 1 GHz, i.e. a period of 1 ns.

We have a launch flop and a capture flop, with the combinational logic in between. The clock network is ideal, i.e. the clock tree is not yet built, so there are no buffers in the clock path. This is a typical scenario for hold time and setup time calculation. The first rising clock edge reaches the launch flop at t = 0 ns and the second rising edge reaches the capture flop at t = 1 ns.

image

The equation for setup time is:

Θ < T - S - SU

The first basic insight is that the combinational delay should be less than the clock period. Analysing the capture flop, we see an additional delay due to the mux inside it. Due to jitter, the exact point of clock arrival also varies; this variation comes from the internal clock circuitry (PLL).

where

  • Θ = Combinational delay which includes clk to Q delay of launch flop and internal propagation delay of all gates between launch and capture flop
  • T = Time period, also called the required time
  • S = Setup time. As demonstrated below, the signal must settle at the middle node (input of Mux 2) before the clock transitions to 1, so the delay due to Mux 1 must be considered; this delay is the setup time.
  • SU = Setup uncertainty due to jitter, which is a temporary variation of the clock period. It is caused by non-idealities of the PLL/clock source.

NOTE: Things are different for hold time.

We have T = 1000 ps, S = 10 ps, SU = 90 ps. Hence we arrive at Θ < 900 ps = 0.9 ns (for our case).
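The condition Θ < T - S - SU can be checked with a few lines of code. This is a minimal sketch using the numbers above; the function names and the 950 ps path delay are hypothetical and only used for illustration.

```python
def max_combinational_delay_ps(period_ps, setup_ps, uncertainty_ps):
    """Upper bound on the combinational delay from the setup condition (ideal clock)."""
    return period_ps - setup_ps - uncertainty_ps

def setup_slack_ps(comb_delay_ps, period_ps, setup_ps, uncertainty_ps):
    """Positive slack means the setup condition is met."""
    return max_combinational_delay_ps(period_ps, setup_ps, uncertainty_ps) - comb_delay_ps

limit = max_combinational_delay_ps(period_ps=1000, setup_ps=10, uncertainty_ps=90)
print(f"combinational delay must be below {limit} ps")

# Hypothetical path with 950 ps of combinational delay
print(setup_slack_ps(950, period_ps=1000, setup_ps=10, uncertainty_ps=90))
```

A negative result for a path means that path violates setup for the given period.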

LAB DAY 4 (PART 3)

OpenSTA for post-synth timing analysis

In CTS the netlist is changed by building the clock tree.

The files below can be found in the extras folder of vsdstdcelldesign.

Create pre_sta.conf and save it in the openlane folder.

set_cmd_units -time ns -capacitance pF -current mA -voltage V -resistance kOhm -distance um
read_liberty -max /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src/sky130_fd_sc_hd__slow.lib
read_liberty -min /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src/sky130_fd_sc_hd__fast.lib
read_verilog /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/runs/29-01_18-06/results/synthesis/picorv32a.synthesis.v
link_design picorv32a
read_sdc /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src/my_base.sdc
report_checks -path_delay min_max -fields {slew trans net cap input_pin}
report_tns
report_wns

After CTS, new .v files are created.

Create my_base.sdc and save it in the src folder of the picorv32a folder.

set ::env(CLOCK_PORT) clk
set ::env(CLOCK_PERIOD) 12.000
set ::env(SYNTH_DRIVING_CELL) sky130_fd_sc_hd__inv_8
set ::env(SYNTH_DRIVING_CELL_PIN) Y
set ::env(SYNTH_CAP_LOAD) 17.65
create_clock [get_ports $::env(CLOCK_PORT)]  -name $::env(CLOCK_PORT)  -period $::env(CLOCK_PERIOD)
set IO_PCT  0.2
set input_delay_value [expr $::env(CLOCK_PERIOD) * $IO_PCT]
set output_delay_value [expr $::env(CLOCK_PERIOD) * $IO_PCT]
puts "\[INFO\]: Setting output delay to: $output_delay_value"
puts "\[INFO\]: Setting input delay to: $input_delay_value"


set clk_indx [lsearch [all_inputs] [get_port $::env(CLOCK_PORT)]]
#set rst_indx [lsearch [all_inputs] [get_port resetn]]
set all_inputs_wo_clk [lreplace [all_inputs] $clk_indx $clk_indx]
#set all_inputs_wo_clk_rst [lreplace $all_inputs_wo_clk $rst_indx $rst_indx]
set all_inputs_wo_clk_rst $all_inputs_wo_clk


# correct resetn
set_input_delay $input_delay_value  -clock [get_clocks $::env(CLOCK_PORT)] $all_inputs_wo_clk_rst
#set_input_delay 0.0 -clock [get_clocks $::env(CLOCK_PORT)] {resetn}
set_output_delay $output_delay_value  -clock [get_clocks $::env(CLOCK_PORT)] [all_outputs]

# TODO set this as parameter
set_driving_cell -lib_cell $::env(SYNTH_DRIVING_CELL) -pin $::env(SYNTH_DRIVING_CELL_PIN) [all_inputs]
set cap_load [expr $::env(SYNTH_CAP_LOAD) / 1000.0]
puts "\[INFO\]: Setting load to: $cap_load"
set_load  $cap_load [all_outputs]

This replicates the results we had after the run_synthesis stage. pre_sta.conf is the file on which we do our STA.
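The values that my_base.sdc derives from its variables can be worked out by hand. The sketch below repeats the arithmetic of the `expr` lines in Python; it assumes, as an interpretation not stated in the file itself, that SYNTH_CAP_LOAD is given in fF so that dividing by 1000 converts it to the pF unit selected in pre_sta.conf.

```python
CLOCK_PERIOD = 12.000    # ns, from my_base.sdc
IO_PCT = 0.2             # fraction of the period reserved for IO delay
SYNTH_CAP_LOAD = 17.65   # assumed to be in fF

# Same arithmetic as the expr lines in the SDC file
input_delay_value = CLOCK_PERIOD * IO_PCT
output_delay_value = CLOCK_PERIOD * IO_PCT
cap_load = SYNTH_CAP_LOAD / 1000.0

print(f"input delay  = {input_delay_value:.3f} ns")
print(f"output delay = {output_delay_value:.3f} ns")
print(f"cap load     = {cap_load:.5f} pF")
```

So with a 12 ns clock, 2.4 ns of every period is budgeted for each of the input and output sides, which tightens the time left for the internal paths.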

To perform pre STA run the command below by opening the terminal in openlane folder which is inside the openlane_working_dir.

sta [file_name] // (our case = pre_sta.conf)

image

As we haven't done CTS yet, the hold analysis doesn't carry any significance. The delay of any cell is a function of input slew and output load, so we can tune these parameters to make the slack positive.

Command to check what a particular cell is driving:

report_net -connections _[cell number/net number]_

We will upsize the buffer and then check the slack again.

image

To replace the buffer (from buf_1 to buf_4) we use the following command.

replace_cell _23732_ sky130_fd_sc_hd__buf_4
  • report_checks reports the worst path; by default it is the max (setup) path
report_checks -fields {net cap slew input_pins} -digits 4

We can see that the slack violation is reduced by some amount.

image

Upsizing a buffer replaces the cell with a stronger one. We can replace cells in this way to reduce the negative slack, at the cost of a slightly larger area.

1. First find the cell you want to replace.
2. Then print its net details with the following command
report_net -connections _[cell number/net number]_ // net number = 23732 name = sky130_fd_sc_hd__buf_1
3. Then use the replace command to upsize it.
replace_cell _[instance to be replaced]_ [new cell name] // instance to be replaced = _23732_ new cell name = sky130_fd_sc_hd__buf_4

Clock Tree Synthesis

There are three parameters that we need to consider when building a clock tree:

  • Clock Skew = A clock tree is used to keep the skew between clock endpoints to a minimum. This results in equal wirelength (thus equal latency/delay) for every path of the clock.
  • Clock Slew = Due to the wire resistance and capacitance of the clock nets, the signal at the clock endpoint is slewed and is no longer the same as the original input clock signal. This can be solved by clock buffers. Clock buffers differ from regular buffer cells in that they have equal rise and fall times.
  • Crosstalk = Clock shielding prevents crosstalk from nearby nets by breaking the coupling capacitance between the victim (clock net) and the aggressors (nets near the clock net). The shield can be connected to VDD or ground since those do not switch. Shielding can also be done on critical data nets.

LAB DAY 4 (PART 3)

The aim is to bring the slack to better than -1 (for more detail refer to this repo). Sadly, that didn't happen for me.

  • The cell modifications made while reducing the slack can be written into a new netlist, which is then used for CTS. The command for this is:
write_verilog [location/picorv32a.synthesis.v] // my case = /home/ee22mtech14005/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/runs/29-01_18-06/results/synthesis/picorv32a.synthesis.v

The modification can be verified by checking that the replaced cell appears in the new netlist.

Now run floorplan again then placement. Use the following command in order one by one.

init_floorplan
place_io
global_placement_or
detailed_placement
tap_decap_or
detailed_placement
gen_pdn

The next stage is running CTS. It uses the default settings.

Variable Description
CTS_TARGET_SKEW The target clock skew in picoseconds. (Default: 200 ps)
CTS_ROOT_BUFFER The name of the cell inserted at the root of the clock tree.
CLOCK_TREE_SYNTH Enable clock tree synthesis with TritonCTS. (Default: 1)
CTS_TOLERANCE An integer value that represents a tradeoff between QoR and runtime. Higher values produce a smaller runtime but worse QoR. (Default: 100)

Then execute the following command for CTS.

run_cts

During CTS buffers are added. It generates the new file named picorv32a.synthesis_cts.v.

Diving deeper into run_cts: OpenLane commands are Tcl procs, and the definitions of these procs are part of the flow scripts. We can inspect the proc as follows:

1. Go to Openlane folder.
2. Inside that go to scripts. Then go to tcl_commands.
3. There we can find the tcl files for each command. 
4. We can see the command inside it which runs the tools.
5. Inside the openroad folder we can get various other tcl files.
6. Pre floorplan is done in openroad and post floorplan and placement are done in openlane.

The def file generated in CTS, picorv32a.cts.def, is then used for the further stages.

Inside CTS only TritonCTS runs. The max slew value is 10% of the clock period. The max cap value is that of the output of the root buffer for the typical corner. OpenSTA is not inside the OpenLane flow.

Timing Analysis with Real Clocks

Setup and hold analysis with real clock will now include clock buffer delays:

  • In setup analysis, the data must arrive before the capturing clock rising edge so that it is latched properly. A setup violation happens when the path is too slow. It is affected by the combinational delay, clock buffer delays, time period, setup time, and setup uncertainty (jitter).

  • Hold time is the delay that the MUX2 model inside the flip-flop needs to move the data out. This is the time for which the launch flop must hold the data before new data reaches the capture flop. Hold analysis is done on the same rising clock edge for the launch and capture flops, unlike setup analysis, which spans two rising clock edges. A hold violation happens when the path is too fast. It is affected by the combinational delay, clock buffer delays, and hold time (the time period and setup uncertainty do not matter, since the launch and capture flops receive the same rising clock edge in hold analysis).

  • Both the setup slack and the hold slack should be positive.

  • OpenROAD was an independent project which was later integrated into OpenLane. OpenROAD has OpenSTA integrated in it.
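The effect of the clock buffer delays on both checks can be sketched as follows. This is a simplified textbook-style model, not the exact computation OpenSTA performs: the function names are illustrative, hold uncertainty is ignored, and all the numbers are hypothetical picosecond values.

```python
def setup_slack(comb, launch_clk, capture_clk, period, setup, uncertainty):
    """Setup check across two clock edges: required time minus arrival time."""
    arrival = launch_clk + comb
    required = period + capture_clk - setup - uncertainty
    return required - arrival

def hold_slack(comb, launch_clk, capture_clk, hold):
    """Hold check on the same clock edge: arrival time minus required time."""
    arrival = launch_clk + comb
    required = capture_clk + hold
    return arrival - required

# Hypothetical values in ps: clock buffer delays of 120 (launch) and 150 (capture)
print(setup_slack(comb=600, launch_clk=120, capture_clk=150,
                  period=1000, setup=10, uncertainty=90))
print(hold_slack(comb=600, launch_clk=120, capture_clk=150, hold=20))
```

Note that the clock period appears only in the setup check, which matches the statement above that the period does not matter for hold analysis.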

LAB DAY 4 (PART 4)

  • In the same terminal in which we ran the run_cts command, launch OpenROAD by typing the following command.
openroad
  • This opens OpenROAD. Our objective is to analyse the entire circuit now that the clock tree has been built. We use the OpenSTA integrated in OpenROAD for the timing analysis.
  1. We first create a db.
  2. The db is created from the lef and def files and is used in our analysis. (It is a one-time process; whenever the lef changes we have to recreate the db.)
  • To create a db

All the locations should be after /openLANE_flow/..... (the run tag 29-01_18-06 below is from my run; use your own)

# first read the LEF (merged.lef, inside the tmp folder of the run)
read_lef /openLANE_flow/designs/picorv32a/runs/29-01_18-06/tmp/merged.lef

# then read the DEF (inside the cts folder under the results folder)
read_def /openLANE_flow/designs/picorv32a/runs/29-01_18-06/results/cts/picorv32a.cts.def

# create the db (pico_cts.db, created under the openlane folder)
write_db pico_cts.db

# read the db
read_db pico_cts.db

# read the post-CTS Verilog netlist (under results/synthesis)
read_verilog /openLANE_flow/designs/picorv32a/runs/29-01_18-06/results/synthesis/picorv32a.synthesis_cts.v

# read the library for max (setup) analysis: the slowest corner
read_liberty -max $::env(LIB_SLOWEST)

# read the library for min (hold) analysis: the fastest corner
read_liberty -min $::env(LIB_FASTEST)

# read the SDC
read_sdc /openLANE_flow/designs/picorv32a/src/my_base.sdc

# the clock tree has now been built, so use propagated (real) clocks
set_propagated_clock [all_clocks]

# report the setup and hold paths
report_checks -path_delay min_max -format full_clock_expanded -digits 4

image

If the chip has a hold violation we cannot compensate for it, whereas a setup violation can be compensated for. CTS is followed by routing, where the actual metal layers are laid out and their resistances and capacitances come into the picture. The metal traces therefore add delay, which increases the delay of the data path.

hold slack (arrival - required) = 1.6507

image

setup slack = 6.5014

image

The design was built for the typical corner, but the analysis above used the minimum and maximum corner libraries, so it is not correct.

To exit OpenROAD:

exit

This time we use the typical corner, repeating the same process from read_db onwards.

openroad
read_db pico_cts.db
read_verilog /openLANE_flow/designs/picorv32a/runs/29-01_18-06/results/synthesis/picorv32a.synthesis_cts.v
read_liberty $::env(LIB_SYNTH_COMPLETE)
link_design picorv32a
read_sdc /openLANE_flow/designs/picorv32a/src/my_base.sdc
set_propagated_clock [all_clocks]
report_checks -path_delay min_max -format full_clock_expanded -digits 4

Slack for typical corner

Hold slack = 0.0702

image

Setup slack = 4.1080

image

For the typical corner both slacks are met, i.e., there is no violation. For the max and min corners the analysis has to be done separately because multi-corner analysis is not supported.
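Reading the slack values off the report by hand can be automated with a few lines of Python. The sketch below assumes the report ends each path with a line of the form `<value>   slack (MET)` or `<value>   slack (VIOLATED)`; the `sample` text is a heavily abbreviated, hypothetical stand-in for the real report that reuses the two slack values quoted above, and `extract_slacks` is a made-up helper name.

```python
import re

# Matches lines such as "    0.0702   slack (MET)".
SLACK_RE = re.compile(
    r"^[ \t]*(-?\d+(?:\.\d+)?)[ \t]+slack \((MET|VIOLATED)\)[ \t]*$",
    re.MULTILINE,
)


def extract_slacks(report_text):
    """Return a list of (slack_value, status) tuples found in the report text."""
    return [(float(value), status) for value, status in SLACK_RE.findall(report_text)]


# Hypothetical, abbreviated report text (not real tool output).
sample = """
Path Type: min
    0.0702   slack (MET)

Path Type: max
    4.1080   slack (MET)
"""

slacks = extract_slacks(sample)
all_met = all(status == "MET" for _, status in slacks)
print(slacks, "all met:", all_met)
```

To use it on a real run, the report text would first have to be saved to a file and read in; the exact line format should be checked against the actual output before relying on the pattern.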

The buffer which we have:

image

When OpenLANE builds the clock tree, it tries to meet the skew target by inserting the buffers from left to right. We always want the skew to be within 10% of the maximum clock period.

The top command can be used to see all running processes.

Day 5 Final step for RTL2GDS

THEORY

LAB DAY 5

  • The commands to reload the previous work (basically everything done so far):
1. cd work/tools/openlane_working_dir/openlane
2. docker
3.  ./flow.tcl -interactive
4. package require openlane 0.9
5. prep -design picorv32a -tag 29-01_18-06
# if the configuration has changed (i.e., the config file was edited), prep must be run with -overwrite
prep -design picorv32a -tag 29-01_18-06 -overwrite 

# to check the most recently created DEF file
echo $::env(CURRENT_DEF)
  • Now we generate the power distribution network (PDN). This should have been done during floorplanning, but as we missed it there we run it now. The PDN step creates the power and ground lines along with the standard cell rails.
gen_pdn

image

Stdcell Rail, Straps and Macro:

image

Standard cells are placed between the rails. The height of each standard cell should be a multiple of 2.72, so that every standard cell has access to both VDD and VSS.
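The height rule can be checked with a short calculation. This is a minimal sketch: `rows_spanned` is a hypothetical helper, the 2.72 pitch is taken from the text, and the other heights are made-up test values. `Decimal` is used so that the check is not disturbed by floating-point rounding.

```python
from decimal import Decimal

RAIL_PITCH = Decimal("2.72")  # rail-to-rail pitch quoted in the text


def rows_spanned(cell_height):
    """Return how many rows a cell spans, or None if its height is not a whole multiple."""
    rows, remainder = divmod(Decimal(str(cell_height)), RAIL_PITCH)
    if remainder == 0 and rows > 0:
        return int(rows)
    return None


for height in ("2.72", "5.44", "3.0"):  # hypothetical cell heights
    print(height, "->", rows_spanned(height))
```

A single-height cell spans one row, a double-height cell spans two, and a height such as 3.0 would not line up with the rails.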

The power distribution of the chip is as follows: power is supplied from the vertical straps to the standard cell rails, and in the same way power is delivered to the macros.

The tmp folder contains the DEF files of every stage.

pdn.def contains the contents of cts.def plus the PDN's own values.

  • Finally 'run_routing'

image

Before routing, we look at the routing switches so that we can reduce the routing time (for the purposes of the workshop).

The command to run routing

run_routing

The entire routing is divided into two steps:

image

  • In global route, the output is a set of routing guides for each of the nets.

  • In detail route, the global route guides are used to make the actual connections between the points.

The output of the fast route (global route) is followed by the detailed route, which must realise the segments and vias in accordance with the global route.

ROUTING SUCCESSFUL

image

After routing, the number of violations is reported; this can be verified by checking the .drc file under routing/16-tritoRoute.drc. (We see a blank DRC file, i.e., no violations.)
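The blank-file check can also be scripted. The sketch below assumes, as observed in this run, that a report with no non-blank lines means no violations; it makes no assumption about how violations are formatted when they do occur. `drc_is_clean` and the two temporary file names are hypothetical and exist only for the demonstration.

```python
import tempfile
from pathlib import Path


def drc_is_clean(report_path):
    """Return True if the report contains no non-blank lines."""
    text = Path(report_path).read_text()
    return not any(line.strip() for line in text.splitlines())


# Demonstration on two throwaway files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    blank = Path(tmp) / "blank.drc"
    blank.write_text("\n")
    dirty = Path(tmp) / "dirty.drc"
    dirty.write_text("hypothetical violation entry\n")
    blank_ok = drc_is_clean(blank)
    dirty_ok = drc_is_clean(dirty)

print("blank report clean:", blank_ok)
print("non-blank report clean:", dirty_ok)
```

In practice the function would be pointed at the .drc file of the run instead of the temporary files.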

  • Extracting SPEF (SPEF extraction is done outside OpenLANE, as OpenLANE does not have a SPEF extractor tool).

The .spef file can be found under the routing folder under the results folder.

image

The following command can be used to stream out the GDSII file.

run_magic

image

The GDS file is now generated and stored in the magic folder under the results folder.

image

Generated layout

image

REFERENCES

  • Kunal Ghosh - Co-founder of VSD
  • Nickson Jose - Workshop Instructor
  • OpenLANE-Sky130-Physical-Design-Workshop

Inquiries

Abhishek Ranjan