# **Bilinear Filtering**

Recall that the blend equation was:

$$C_{new} = C_a \cdot f + C_b \cdot (1-f)$$

where $C_a$, $C_b$ were two 8-bit colors, and $C_{new}$ was a blend of these two colors using the blend factor $f$ (a 9-bit value).
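As a quick numeric check of the blend equation, the worked example below assumes the 9-bit factor stores 1.0 as 256 (8 fractional bits). The color values are made up for illustration only.

$$f = 0.5 \;\Rightarrow\; f_{9\text{-bit}} = 128, \qquad C_a = 200, \quad C_b = 100$$

$$C_{new} = \frac{200 \cdot 128 + 100 \cdot (256 - 128)}{256} = \frac{25600 + 12800}{256} = 150$$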

A similar operation is performed when a texture is mapped onto an object in 3D graphics, except that 2 blend factors and four colors are used:

$$T_{new} = (1-v)(1-u)\,T_{00} + (1-v)\,u\,T_{01} + v\,(1-u)\,T_{10} + u\,v\,T_{11}$$

$T_{00}, T_{01}, T_{10}, T_{11}$ are 8-bit color values as before, with two 9-bit factors $v$, $u$ used to determine $T_{new}$.

3/26/2002 BR

# Bilinear Filtering (cont)

We will use 9 bits to represent 1.0 accurately.

Sample calculations:

- u = 1.0, v = 1.0, then Tnew = T<sub>11</sub>
- u = 0.0, v = 1.0, then Tnew = T<sub>10</sub>
- u = 1.0, v = 0.0, then Tnew = T<sub>01</sub>
- u = 0.0, v = 0.0, then Tnew = T<sub>00</sub>
- u = 0.5, v = 0.5, then Tnew = 0.25\*T<sub>00</sub> + 0.25\*T<sub>01</sub> + 0.25\*T<sub>10</sub> + 0.25\*T<sub>11</sub>
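The u = 0.5, v = 0.5 case can be checked in the 9-bit encoding. This sketch assumes 1.0 is stored as 256 (binary 100000000) and that a product of two 9-bit factors is rescaled by dividing by 256:

$$u = v = 128, \qquad 1-u = 1-v = 256 - 128 = 128$$

$$(1-v)(1-u) = \frac{128 \cdot 128}{256} = 64 \quad (= 0.25 \times 256)$$

All four weights come out to 64, and $4 \times 64 = 256$, so in this case the weights sum to exactly 1.0.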

# The Problem

- Use Synopsys Behavioral Compiler to create three different implementations
  - Minimum resource implementation (1 adder, 1 multiplier), no-overlapped computations. New output is produced every ???? clock cycles??
  - 2 Multiplier implementation no overlapped computations. New output is produced every ????? clock cycles.
  - Overlapped computation implementation in which input bus is always busy and a new output is produced every 4 clock cycles.
- Will use '+','\*', *oneminus* operations from *dwdsp\_arith\_unsigned* and modules from *DWDSP.sl* synthetic library.
- Must have completed *DWDSP\_mult\_csa.vhd* from previous assignment.
  Only keep 8 most significant bits of each multiplication operation.
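The arithmetic described above can be sketched as a small reference package. This is a minimal sketch under stated assumptions, not the assignment solution: it uses the standard `ieee.numeric_std` package rather than *dwdsp\_arith\_unsigned*, it assumes 256 encodes 1.0, and the names `bifilt_ref_pkg`, `fix9`, `oneminus_ref`, `mult_trunc`, and `bifilt_ref` are hypothetical. It also fixes one particular multiplication order, which may not be the order used to produce the golden results.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical reference package; not one of the assignment files.
-- Assumes 9-bit values in which 256 ("100000000") encodes 1.0.
package bifilt_ref_pkg is
  subtype fix9 is unsigned(8 downto 0);
  function oneminus_ref(x : fix9) return fix9;
  function mult_trunc(a, b : fix9) return fix9;
  function bifilt_ref(u, v, t00, t01, t10, t11 : fix9) return fix9;
end package bifilt_ref_pkg;

package body bifilt_ref_pkg is

  -- 1.0 - x; meaningful only for x in the range 0 .. 256.
  function oneminus_ref(x : fix9) return fix9 is
  begin
    return to_unsigned(256, 9) - x;
  end function;

  -- 9x9 multiply that drops the 8 least significant product bits.
  -- For operands <= 256 the product is <= 65536, so bits 16..8 hold the result.
  function mult_trunc(a, b : fix9) return fix9 is
    variable p : unsigned(17 downto 0);
    variable r : fix9;
  begin
    p := a * b;
    r := p(16 downto 8);
    return r;
  end function;

  -- One possible evaluation order: form each weight first, then scale Txx.
  -- Assumes u, v <= 256 and Txx <= 255, so the 9-bit sum cannot overflow.
  function bifilt_ref(u, v, t00, t01, t10, t11 : fix9) return fix9 is
    variable nu, nv : fix9;
  begin
    nu := oneminus_ref(u);
    nv := oneminus_ref(v);
    return mult_trunc(mult_trunc(nv, nu), t00)
         + mult_trunc(mult_trunc(nv, u),  t01)
         + mult_trunc(mult_trunc(v,  nu), t10)
         + mult_trunc(mult_trunc(v,  u),  t11);
  end function;

end package body bifilt_ref_pkg;
```

With u = v = 256 (1.0), the first three weights are zero and the last is 256, so `bifilt_ref` returns `t11`, matching the first sample calculation. A model like this is only useful for checking arithmetic by hand; it says nothing about how the operations are scheduled into super states.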

# bifilt Entity

<pre>
entity bifilt is
  port ( clk, reset: in std_logic;
         din: in std_logic_vector(8 downto 0);
         coeff_rdy: out std_logic;
         irdy: out std_logic;
         ordy: out std_logic;
         dout: out std_logic_vector(7 downto 0) );
end bifilt;
</pre>

- *din*: input bus for *u*, *v*, and *Txx* values.
- *coeff\_rdy*: asserted when ready for input of *u*, *v* after synchronous reset.
- *irdy*: asserted when ready for input of successive *Txx* values (T00, T01, T10, T11 on successive clock cycles).
- *ordy*: asserted when *dout* has a valid output value.

# *bifilt\_behv.vhd* architecture: *reset* states

<pre>
library ieee, dwdsp;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use dwdsp.dwdsp_arith_unsigned.all;

architecture behv of bifilt is
begin
  main: process
    variable u, v: std_logic_vector(8 downto 0);
  begin
    reset_loop: loop
      ordy &lt;= '0';
      irdy &lt;= '0';
      coeff_rdy &lt;= '0';
      wait until clk'event and clk = '1';
      if (reset = '1') then exit reset_loop; end if;
</pre>

All handshaking lines are negated on reset.

# *bifilt\_behv.vhd* architecture: *reset* states (cont)

The *u*, *v* values are input on successive clocks after assertion of *coeff\_rdy*.

<pre>
      coeff_rdy &lt;= '1';
      wait until clk'event and clk = '1';
      if (reset = '1') then exit reset_loop; end if;
      coeff_rdy &lt;= '0';
      u := din;
      wait until clk'event and clk = '1';
      if (reset = '1') then exit reset_loop; end if;
      v := din;
      wait until clk'event and clk = '1';
      if (reset = '1') then exit reset_loop; end if;
      irdy &lt;= '1';
      wait until clk'event and clk = '1';
      if (reset = '1') then exit reset_loop; end if;
      l1: loop -- sample loop
</pre>

# *bifilt\_behv.vhd* architecture: *sample loop*

<pre>
      l1: loop -- sample loop
        -- fill this in.
      end loop; -- l1
    end loop; -- reset_loop
  end process;
</pre>

Fill in the sample\_loop. Must input T00, T01, T10, T11 in successive super states (you can compute with a *Txx* value in the same super state in which you input the value).

For non-pipelined implementation, *irdy* and *ordy* must be negated in the first super state, and asserted in the super state in which the output is ready.

For the pipelined implementation, *irdy* and *ordy* are never negated after initial assertion in the reset states (new Txx value input every clock, new Tnew value every 4 clocks).

# bifilt\_test.zip Archive

This expands to a *bifilt\_test/* directory that provides a testbench for your bifilt implementations. Install this as a modelsim library. Files are:

*bifilt\_behv.vhd* -- behavioral model for bifilt implementation, will be used for Synthesis with Behavioral Compiler.

*bifilt\_mult1.vhd, bifilt\_mult2.vhd, bifilt\_pipe.vhd* – replace these with 3 synthesized gate level implementations (1 multiplier, 2 multipliers, pipelined).

*tb.vhd* – testbench for use with 'behv', 'mult1', 'mult2' implementations (provides configurations for each).

*tb\_pipe.vhd* -- testbench for use with the 'pipe' implementation.

*bifilt\_behv.log* -- log file that has the golden output results; the output of every implementation should match these results.

# dsp\_dware.zip Archive

This archive unpacks to a *dsp\_dware/* directory (same as previous assignment). This only contains three files:

*behv/bifilt.vhd* -- edit the architecture to contain the architecture you created in *bifilt\_test/bifilt\_behv.vhd*. Synthesize the mult1 and mult2 implementations via this file.

*bifilt\_mult1.script* -- a dc\_shell script that uses *behv/bifilt.vhd* to synthesize a minimum resource implementation using Behavioral Compiler. Create new versions of this script (*bifilt\_mult2.script*, *bifilt\_pipe.script*) to synthesize the two-multiplier and pipelined implementations.

*behv/bifilt\_pipe.vhd* -- replace this with the architecture that will be used for the pipelined implementation. The only difference is that *irdy* and *ordy* are never negated after their initial assertion (new Txx value input every clock, new Tnew value every 4 clocks).

# Procedure

- Complete the *bifilt\_test/bifilt\_behv.vhd* architecture and simulate in modelsim using the *cfg\_behv* configuration provided in *bifilt\_test/tb.vhd*. The results must match the *bifilt\_test/bifilt\_behv.log* results.
- Place the architecture from *bifilt\_test/bifilt\_behv.vhd* into the *dsp\_dware/behv/bifilt.vhd* file. Use *dc\_shell* and the *dsp\_dware/bifilt\_mult1.script* script to synthesize a gate level implementation.
- The gate level implementation will be placed in the *gate/* directory. Copy this to the *bifilt\_test* directory and simulate using modelsim; verify that the output results match the *bifilt\_behv* simulation results.

# Procedure (cont)

- Create a new version of the *bifilt\_mult1.script* such that a two multiplier implementation is synthesized
  - Call new script bifilt\_mult2.script, write the gate level output to gate/bifilt\_mult2.vhd.

- Synthesize using dc\_shell; look at the report file and verify that two multipliers are used
- Copy the gate/bifilt\_mult2.vhd file to the bifilt\_test directory and simulate with modelsim – verify the output results match the bifilt\_behv results.

# Procedure (cont)

- Create a new version of the *bifilt\_mult1.script* such that a pipelined implementation is synthesized that inputs a new *Txx* value every clock with outputs produced every four clocks
  - Call the new script *bifilt\_pipe.script*; it must read the file *behv/bifilt\_pipe.vhd*.
  - Must create a new file called 'behv/bifilt\_pipe.vhd' that is only a slight modification of the original 'behv/bifilt.vhd' – *irdy* is never negated after its assertion
  - Use the configuration named cfg\_pipe provided in bifilt\_test/tb\_pipe.vhd to verify that the output results match the bifilt\_test/bifilt\_behv results.

# Required Files for Submission

- All files placed in a directory called *sim7*
- *./bifilt\_mult2.script*: script for synthesizing the 2 multiplier implementation; must read file *behv/bifilt.vhd* and produce file *gate/bifilt\_mult2.vhd*
- *./bifilt\_pipe.script*: script for synthesizing the pipelined implementation; must read file *behv/bifilt\_pipe.vhd* and produce file *gate/bifilt\_pipe.vhd*
- *./behv/bifilt.vhd*: file read by *bifilt\_mult2.script*
- *./behv/bifilt\_pipe.vhd*: file read by *bifilt\_pipe.script*

# Comments on Testbench (*tb.vhd*, *tb\_pipe.vhd*)

- Testbench computes 8 Tnew values using 8 sets of Txx values read from a 32-location memory (8 x 4 = 32).
- 6 different values of u, v used for each set of 8 Tnew values:
  - v = 1.0, u = 0.0
  - v = 0.0, u = 1.0
  - v = 0.0, u = 0.0
  - v = 1.0, u = 0.5
  - v = 0.75, u = 0.5
- For the last two cases, you might get a different value in the LSB than my provided golden file, depending on the order of the multiplications among the three operands of each term ((1-v) or v, (1-u) or u, and Txx).
  - The difference is due to dropping the least significant 8 bits.
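The order dependence can be seen in a small worked case. This sketch assumes each multiplication computes $\lfloor a \cdot b / 256 \rfloor$ (the 8 least significant product bits are dropped) and that 1.0 is stored as 256. It evaluates the single term $v\,(1-u)\,T_{10}$ with $v = 0.75$ (192), $1-u = 0.5$ (128), and an illustrative $T_{10} = 3$ that is not taken from the testbench memory.

Weights multiplied first:

$$\left\lfloor \frac{192 \cdot 128}{256} \right\rfloor = 96, \qquad \left\lfloor \frac{96 \cdot 3}{256} \right\rfloor = \left\lfloor \frac{288}{256} \right\rfloor = 1$$

$(1-u)$ multiplied with $T_{10}$ first:

$$\left\lfloor \frac{128 \cdot 3}{256} \right\rfloor = \left\lfloor \frac{384}{256} \right\rfloor = 1, \qquad \left\lfloor \frac{192 \cdot 1}{256} \right\rfloor = 0$$

The exact value is $0.75 \cdot 0.5 \cdot 3 = 1.125$, so the two orders truncate to 1 and 0 respectively, a one-LSB difference in this term.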