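Motivation
Somewhere near the bottom of the report generated by ISE's synthesizer (XST), it states the maximal frequency, along with an outline of the slowest path. This is quite a nice feature, in particular when attempting to optimize a specific module. No such figure is given after synthesis with Vivado, possibly because the people at Xilinx figured that this "maximal frequency" can be misleading. If so, they had two good reasons:
- There is no such thing as a "maximal frequency": The tools look at the timing constraints and do their best accordingly. In short, you probably won't get a certain frequency unless you ask for it.
- In a typical design, there are many clocks with different frequencies. The slowest path may belong to a clock that is slow anyway.
Nevertheless, it's sometimes useful to know where things stand before going through the Via Dolorosa of a full implementation.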
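How to do this in Vivado
First and foremost: Set up the timing constraints according to your expectations. Or at least, make it clear in some way which clocks are important and which may be slow. Then run synthesis on the design.
After synthesis has completed successfully, open the synthesized design (click "Open Synthesized Design" on the left-hand bar, or use the Tcl command "open_run synth_1").
In the Tcl Console, type the command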
report_timing_summary -file mytiming.rpt
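which writes a full post-synthesis timing report into mytiming.rpt. Plain "report_timing_summary" prints it to the console instead.
There is also a "Report Timing Summary" option under "Synthesized Design" on the left-hand bar, but I find it difficult to get information out of the report through the GUI.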
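Reading the report
Rule #1: The post-synthesis report is just a rough estimate. The routing delays are guesses. It may report timing failures where implementation would succeed anyway, and it may say that all is fine where implementation will fail badly (in particular when the FPGA's logic usage approaches 100%).
Now to action: The first things to look at are the Clock Summary and the Intra Clock Table, in order to work out which name Vivado has given to which clock. For example,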
------------------------------------------------------------------------------------------------ | Clock Summary | ------------- ------------------------------------------------------------------------------------------------ Clock Waveform(ns) Period(ns) Frequency(MHz) ----- ------------ ---------- -------------- clk_fpga_1 {0.000 5.000} 10.000 100.000 gclk {0.000 4.000} 8.000 125.000 audio_mclk_OBUF {0.000 41.667} 83.333 12.000 clk_fb {0.000 20.000} 40.000 25.000 vga_clk_ins/clk_fb {0.000 20.000} 40.000 25.000 vga_clk_ins/clkout0 {0.000 1.538} 3.077 325.000 vga_clk_ins/clkout1 {0.000 7.692} 15.385 65.000 vga_clk_ins/clkout2 {0.000 7.692} 15.385 65.000 ------------------------------------------------------------------------------------------------ | Intra Clock Table | ----------------- ------------------------------------------------------------------------------------------------ Clock WNS(ns) TNS(ns) TNS Failing Endpoints TNS Total Endpoints WHS(ns) THS(ns) THS Failing Endpoints THS Total Endpoints WPWS(ns) TPWS(ns) TPWS Failing Endpoints TPWS Total Endpoints ----- ------- ------- --------------------- ------------------- ------- ------- --------------------- ------------------- -------- -------- ---------------------- -------------------- clk_fpga_1 3.791 0.000 0 12474 0.135 0.000 0 12474 3.750 0.000 0 5021 gclk 6.751 0.000 0 2 audio_mclk_OBUF 76.667 0.000 0 1 clk_fb 12.633 0.000 0 2 vga_clk_ins/clk_fb 38.751 0.000 0 2 vga_clk_ins/clkout0 1.410 0.000 0 10 vga_clk_ins/clkout1 10.747 0.000 0 215 -0.029 -0.229 8 215 6.712 0.000 0 195 vga_clk_ins/clkout2 3.990 0.000 0 415 0.135 0.000 0 415 7.192 0.000 0 211
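When the report is long, it can be handy to pull the Clock Summary rows out with a script rather than by eye. The following is a minimal Python sketch, not something Vivado provides: the function name parse_clock_summary and the regular expression are my own assumptions about the row layout shown above (clock name, waveform in braces, period, frequency), and may need adjusting for other report versions.

```python
import re

# Assumed row layout: <name> {<rise> <fall>} <period> <frequency>
ROW = re.compile(
    r"^\s*(\S+)\s+\{([\d.]+)\s+([\d.]+)\}\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_clock_summary(report_text):
    """Return {clock_name: (period_ns, frequency_mhz)} for matching rows."""
    clocks = {}
    for line in report_text.splitlines():
        m = ROW.match(line)
        if m:
            name, _rise, _fall, period, freq = m.groups()
            clocks[name] = (float(period), float(freq))
    return clocks


# A few rows copied from the Clock Summary in the example report.
sample = """\
Clock                Waveform(ns)      Period(ns)  Frequency(MHz)
-----                ------------      ----------  --------------
clk_fpga_1           {0.000 5.000}     10.000      100.000
gclk                 {0.000 4.000}     8.000       125.000
vga_clk_ins/clkout0  {0.000 1.538}     3.077       325.000
"""

clocks = parse_clock_summary(sample)
for name, (period, freq) in clocks.items():
    print(f"{name}: {period} ns, {freq} MHz")
```

Header and separator lines are skipped simply because they contain no waveform in braces, so the whole report file can be fed to the function as-is.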
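If the clock frequencies listed in the Clock Summary (which are derived from the timing constraints) don't help in matching clocks with their names, the TNS Total Endpoints figure for each clock in the Intra Clock Table helps in telling which clock is which. Once the name of the clock of interest has been found, search the file for this name, and find something like this: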
Max Delay Paths
--------------------------------------------------------------------------------------
Slack (MET) :             3.791ns  (required time - arrival time)
  Source:                 xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_offset_limit_1/C
                            (rising edge-triggered cell FDRE clocked by clk_fpga_1  {rise@0.000ns fall@5.000ns period=10.000ns})
  Destination:            xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0/D
                            (rising edge-triggered cell FDRE clocked by clk_fpga_1  {rise@0.000ns fall@5.000ns period=10.000ns})
  Path Group:             clk_fpga_1
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            10.000ns  (clk_fpga_1 rise@10.000ns - clk_fpga_1 rise@0.000ns)
  Data Path Delay:        6.077ns  (logic 2.346ns (38.605%)  route 3.731ns (61.395%))
  Logic Levels:           8  (CARRY4=3 LUT3=1 LUT4=1 LUT6=3)
  Clock Path Skew:        -0.040ns (DCD - SCD + CPR)
    Destination Clock Delay (DCD):    0.851ns = ( 10.851 - 10.000 )
    Source Clock Delay      (SCD):    0.901ns
    Clock Pessimism Removal (CPR):    0.010ns
  Clock Uncertainty:      0.154ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Total Input Jitter      (TIJ):    0.300ns
    Discrete Jitter          (DJ):    0.000ns
    Phase Error              (PE):    0.000ns

    Location     Delay type                        Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                 (clock clk_fpga_1 rise edge)         0.000     0.000 r
                 PS7                                  0.000     0.000 r  xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/PS7_i/FCLKCLK[1]
                 net (fo=1, unplaced)                 0.000     0.000    xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/n_707_PS7_i
                 BUFG (Prop_bufg_I_O)                 0.101     0.101 r  xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/buffer_fclk_clk_1.FCLK_CLK_1_BUFG/O
                 net (fo=5023, unplaced)              0.800     0.901    xillybus_ins/xillybus_core_ins/bus_clk_w
                                                                      r  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_offset_limit_1/C
  -------------------------------------------------------------------    -------------------
                 FDRE (Prop_fdre_C_Q)                 0.496     1.397 f  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_offset_limit_1/Q
                 net (fo=5, unplaced)                 0.834     2.231    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_offset_limit[1]
                 LUT4 (Prop_lut4_I0_O)                0.289     2.520 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_lutdi/O
                 net (fo=1, unplaced)                 0.000     2.520    xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_lutdi
                 CARRY4 (Prop_carry4_DI[0]_CO[3])     0.553     3.073 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_cy[0]_CARRY4/CO[3]
                 net (fo=1, unplaced)                 0.000     3.073    xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_cy[3]
                 CARRY4 (Prop_carry4_CI_CO[3])        0.114     3.187 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_cy[4]_CARRY4/CO[3]
                 net (fo=3, unplaced)                 0.936     4.123    xillybus_ins/xillybus_core_ins/unitw_1_ins/Mcompar_n0037_cy[7]
                 LUT6 (Prop_lut6_I4_O)                0.124     4.247 f  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_wr_request_condition/O
                 net (fo=7, unplaced)                 0.480     4.727    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_wr_request_condition
                 LUT3 (Prop_lut3_I2_O)                0.124     4.851 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_flush_condition_unitw_1_wr_request_condition_AND_179_o3_lut/O
                 net (fo=1, unplaced)                 0.000     4.851    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_flush_condition_unitw_1_wr_request_condition_AND_179_o3_lut
                 CARRY4 (Prop_carry4_S[2]_CO[3])      0.398     5.249 f  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_flush_condition_unitw_1_wr_request_condition_AND_179_o2_cy_CARRY4/CO[3]
                 net (fo=21, unplaced)                0.979     6.228    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_flush_condition_unitw_1_wr_request_condition_AND_179_o
                 LUT6 (Prop_lut6_I5_O)                0.124     6.352 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/_n03401/O
                 net (fo=15, unplaced)                0.502     6.854    xillybus_ins/xillybus_core_ins/unitw_1_ins/_n0340
                 LUT6 (Prop_lut6_I5_O)                0.124     6.978 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0_rstpot/O
                 net (fo=1, unplaced)                 0.000     6.978    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0_rstpot
                 FDRE                                                 r  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0/D
  -------------------------------------------------------------------    -------------------

                 (clock clk_fpga_1 rise edge)        10.000    10.000 r
                 PS7                                  0.000    10.000 r  xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/PS7_i/FCLKCLK[1]
                 net (fo=1, unplaced)                 0.000    10.000    xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/n_707_PS7_i
                 BUFG (Prop_bufg_I_O)                 0.091    10.091 r  xillybus_ins/system_i/vivado_system_i/processing_system7_0/inst/buffer_fclk_clk_1.FCLK_CLK_1_BUFG/O
                 net (fo=5023, unplaced)              0.760    10.851    xillybus_ins/xillybus_core_ins/bus_clk_w
                                                                      r  xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0/C
                 clock pessimism                      0.010    10.861
                 clock uncertainty                   -0.154    10.707
                 FDRE (Setup_fdre_C_D)                0.062    10.769    xillybus_ins/xillybus_core_ins/unitw_1_ins/unitw_1_end_offset_0
  -------------------------------------------------------------------
                 required time                                 10.769
                 arrival time                                  -6.978
  -------------------------------------------------------------------
                 slack                                          3.791
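This is quite a messy chunk of text, but only a few elements in it are key (marked in red in the original listing); they are the ones discussed next.
Before drawing any conclusions, make sure that you're looking at the right section:
- It's the Max Delay Paths section. The minimal delay paths section is used to spot hold time violations, and has no bearing on the maximal frequency.
- It's the right clock. In the example above, it's clk_fpga_1. The Requirement line not only states the constraint given for this clock (10 ns = 100 MHz), but also that the path goes from one rising edge of clk_fpga_1 to the next one.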
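The two checks above can also be done mechanically. Here is a hedged Python sketch that picks the relevant header fields out of one path entry; parse_path_header and is_setup_path_of are hypothetical helper names of mine, and the regular expressions simply assume the field labels seen in the example report ("Slack", "Path Group:", "Path Type:", "Requirement:").

```python
import re

# Field labels as they appear in the example report (assumed stable).
PATTERNS = {
    "slack_ns": r"Slack\s*\(\w+\)\s*:\s*(-?[\d.]+)ns",
    "path_group": r"Path Group:\s*(\S+)",
    "path_type": r"Path Type:\s*(\w+)",
    "requirement_ns": r"Requirement:\s*(-?[\d.]+)ns",
}


def parse_path_header(text):
    """Extract slack, path group, path type and requirement from one entry."""
    fields = {}
    for key, pattern in PATTERNS.items():
        m = re.search(pattern, text)
        if m is None:
            raise ValueError(f"field not found: {key}")
        fields[key] = m.group(1)
    fields["slack_ns"] = float(fields["slack_ns"])
    fields["requirement_ns"] = float(fields["requirement_ns"])
    return fields


def is_setup_path_of(fields, clock_name):
    """True if the entry is a setup (max delay) path in the clock's group."""
    return fields["path_type"] == "Setup" and fields["path_group"] == clock_name


# Header lines copied from the example report.
entry = """\
Slack (MET) :             3.791ns  (required time - arrival time)
  Path Group:             clk_fpga_1
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            10.000ns  (clk_fpga_1 rise@10.000ns - clk_fpga_1 rise@0.000ns)
"""

fields = parse_path_header(entry)
print(fields, is_setup_path_of(fields, "clk_fpga_1"))
```

A hold path would fail the is_setup_path_of check, which is the scripted equivalent of making sure the entry comes from the Max Delay Paths section.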
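Having that done, let's see what we got: The requirement is 10 ns, and the slack is 3.791 ns (note that it's positive). This means that we could have asked for a clock period that is 3.791 ns shorter, and it would still have been fine. So the requested clock period could have been 10 - 3.791 = 6.209 ns, which is about 161 MHz.
So the short answer to the "maximal clock" question for clk_fpga_1 is 161 MHz. But keep in mind that this figure may change if the constraints change.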
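The arithmetic above fits in a few lines of Python. This is only a sketch of the same calculation (minimal period = requirement minus slack); the function name max_frequency_mhz is my own, and the result inherits all the caveats of a post-synthesis estimate.

```python
def max_frequency_mhz(requirement_ns, slack_ns):
    """Estimate the highest clock frequency from a setup path's figures.

    The shortest period that would still meet timing is the requirement
    minus the slack. A negative slack therefore lengthens the period.
    """
    min_period_ns = requirement_ns - slack_ns
    if min_period_ns <= 0:
        raise ValueError("slack must be smaller than the requirement")
    return 1000.0 / min_period_ns  # period in ns -> frequency in MHz


# Figures from the clk_fpga_1 path in the example report.
fmax = max_frequency_mhz(10.000, 3.791)
print(f"{fmax:.1f} MHz")
```

The same function covers a failing path: with a negative slack the estimated frequency comes out lower than the one requested.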
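One last remark: The Data Path Delay line tells us what makes this worst path slow or fast, that is, how much of the delay is spent in logic and how much in the (estimated) routing. The same goes for the detailed delay listing below it. For a more detailed report, consider using the "-nworst" flag when requesting the timing report, so that several worst-case paths are listed. This can help in solving timing problems.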
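To compare the logic and routing shares across several paths, the Data Path Delay line can be parsed as well. Again, this is a minimal Python sketch under the assumption that the line looks like the one in the example report; delay_breakdown is a hypothetical helper name.

```python
import re

# Assumed format:
# Data Path Delay: <total>ns (logic <x>ns (<p>%) route <y>ns (<q>%))
DELAY = re.compile(
    r"Data Path Delay:\s*([\d.]+)ns\s*"
    r"\(logic\s+([\d.]+)ns\s*\([\d.]+%\)\s*"
    r"route\s+([\d.]+)ns\s*\([\d.]+%\)\)"
)


def delay_breakdown(line):
    """Return total, logic and route delays (ns) plus the route share."""
    m = DELAY.search(line)
    if m is None:
        raise ValueError("no Data Path Delay line found")
    total, logic, route = (float(g) for g in m.groups())
    return {
        "total_ns": total,
        "logic_ns": logic,
        "route_ns": route,
        "route_share": route / total,
    }


# Line copied from the example report.
line = "Data Path Delay:        6.077ns  (logic 2.346ns (38.605%)  route 3.731ns (61.395%))"
breakdown = delay_breakdown(line)
print(f"routing accounts for {breakdown['route_share']:.1%} of the data path delay")
```

Since the routing figures are guesses at this stage, a path dominated by routing delay is the kind most likely to look different after implementation.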