DTAP does not really replace JTAG: boundary scan happens at the boundaries, right? But DTAP should also be unintrusive, not only for the functionality but also for the precious resources (space, time, energy...).

DTAP is more like a synchronous serial logic analyser and has some overlap with common JTAG uses, which have evolved a lot since the early days of scanning the pins of a chip. Today all the magic happens inside the silicon: debugging a CPU, programming its internal Flash memory, you know the rest.

DTAP tries to solve the challenges of including a boundary-scan-like technology inside an FPGA or ASIC:

  • BILBOs (Built-In Logic Block Observers) are cumbersome, large and slow, so their use should be limited.
  • Crossing clock domains is a risky business.
  • Synthesisers treat all signals alike by default, so they promote the boundary scan's clock and control signals to "fast, high-fanout" nets. These eat into the available P&R budget and hurt other, more legitimate and critical signals (such as the actual high-performance clocks).

So a typical BILBO strategy will actually reduce performance and other metrics. But Design-For-Test is not optional!

DTAP uses several methods to solve these problems:

  • Separate/decouple read from write.
  • Reading (logic probes) uses large balanced binary trees (with balanced control signals) spread over the entire design and driven by a Gray-code generator. This reaches a reasonable speed at low power, since there is no single large clock network to power and drive.
  • Writing (to internal control registers) is split into "slices" of shift registers, each with its own clock network, to avoid hogging the main DTAP clock signal.
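
To get a feel for why a Gray-code generator suits the read trees, here is a small Python model. It is only a sketch under assumptions: the probes are modelled as a plain list whose length is a power of two, select bit k drives tree level k, and the names `gray`, `mux_tree_read`, `scan` and `select_toggles` are made up for this illustration, not taken from the DTAP design.

```python
def gray(n: int) -> int:
    """Return the reflected binary Gray code of n."""
    return n ^ (n >> 1)


def mux_tree_read(probes, select):
    """Model a balanced binary tree of 2:1 muxes.

    Assumption: bit k of `select` drives every mux of tree level k,
    level 0 being the one closest to the probes.
    """
    level = list(probes)
    k = 0
    while len(level) > 1:
        bit = (select >> k) & 1
        # Each mux keeps one input out of a pair.
        level = [level[i + bit] for i in range(0, len(level), 2)]
        k += 1
    return level[0]


def scan(probes):
    """Read every probe once, stepping the select lines in Gray order."""
    n = len(probes)  # assumed to be a power of two
    return [(gray(i), mux_tree_read(probes, gray(i))) for i in range(n)]


def select_toggles(n_steps, encode):
    """Count select-line transitions over n_steps consecutive addresses."""
    return sum(
        bin(encode(i) ^ encode(i + 1)).count("1")
        for i in range(n_steps - 1)
    )
```

With 8 probes, `select_toggles(8, gray)` counts one transition per step, while a plain binary counter (`select_toggles(8, lambda i: i)`) toggles more select lines over the same scan. Fewer transitions on high-fanout control lines is where the power saving would come from in this model.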

DTAP must also be very easy to use/control with a microcontroller, an Arduino board, a Raspberry Pi, an ESP... All you need is one GPIO and an SPI port (or 4 GPIOs in total, if you bit-bang).

DTAP can use 3 signals if MISO and MOSI are shared (it's a half-duplex protocol).
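
As an illustration of the shared-data-line case, here is a bit-banged half-duplex transfer in Python. Everything in it is an assumption made for the sketch: the bit order (MSB first), the sampling edge (rising), the pin-driver interface and the `SimulatedPins` stand-in, which only exists so the code runs without hardware. DTAP's real framing is not described here, and the role of the extra GPIO is not modelled.

```python
class SimulatedPins:
    """Hypothetical pin driver plus a fake target on the other end.

    The fake target stores the bits it receives and plays them back
    when the host releases the data line. Replace this class with real
    GPIO calls on an actual board.
    """

    def __init__(self):
        self.host_drives = True
        self.line = 0
        self.clk = 0
        self.stored = []  # bits captured by the fake target

    def data_dir(self, out):
        self.host_drives = out

    def data_write(self, bit):
        self.line = bit

    def data_read(self):
        return self.line

    def clock(self, level):
        rising = level == 1 and self.clk == 0
        self.clk = level
        if rising:
            if self.host_drives:
                self.stored.append(self.line)   # target samples the host bit
            elif self.stored:
                self.line = self.stored.pop(0)  # target drives its next bit


def write_bits(pins, value, nbits):
    """Shift nbits of value out, MSB first, on the shared data line."""
    pins.data_dir(out=True)
    for i in reversed(range(nbits)):
        pins.data_write((value >> i) & 1)
        pins.clock(1)  # assumed: target samples on the rising edge
        pins.clock(0)


def read_bits(pins, nbits):
    """Release the data line and shift nbits in, MSB first."""
    pins.data_dir(out=False)
    value = 0
    for _ in range(nbits):
        pins.clock(1)
        value = (value << 1) | pins.data_read()
        pins.clock(0)
    return value
```

Usage: create `pins = SimulatedPins()`, call `write_bits(pins, 0xA5, 8)`, then `read_bits(pins, 8)` returns the same byte from the fake target. The point is only that direction changes happen between phases, never within one, which is what lets MISO and MOSI share a wire.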

Advanced versions of DTAP would work in 3 or 4 modes (detected by pull-up/pull-down resistors during chip reset):

  1. Debug and Test mode (real-time control from an external host)
  2. Synchronous firmware loading (DTAP acts as an SPI slave)
  3. Autonomous firmware loading (DTAP acts as an SPI master, with its own clock, to read an SPI Flash chip)
  4. Asynchronous firmware loading (DTAP acts as a serial receiver and must recover the host's clock period)
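
The mode selection above could be pictured as follows. This is a guess at one possible encoding, not DTAP's actual one: it assumes two strap pins sampled once at reset, and the names `Mode`, `decode_straps`, `strap_a` and `strap_b` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    """The four modes listed above; the numeric codes are assumptions."""
    DEBUG_TEST = 0   # real-time control from an external host
    SYNC_LOAD = 1    # DTAP clocked by the host over SPI
    AUTO_LOAD = 2    # DTAP clocks an SPI Flash chip itself
    ASYNC_LOAD = 3   # DTAP recovers the host's clock period


def decode_straps(strap_a: int, strap_b: int) -> Mode:
    """Map two pin levels, as set by the pull resistors, to a mode.

    Assumption: strap_a is the low bit and strap_b the high bit.
    """
    return Mode(((strap_b & 1) << 1) | (strap_a & 1))
```

For example, `decode_straps(0, 0)` gives `Mode.DEBUG_TEST` and `decode_straps(1, 1)` gives `Mode.ASYNC_LOAD` under this made-up encoding. A 3-mode version would simply leave one combination unused.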

DTAP has no /MCLR or /RESET pin, so it can still work while the target is held inactive. There are quite a few tricks to manage the clock domains while keeping the gate count low.