
How to fix USB on the jetson nano

A project log for Auto tracking camera

A camera that tracks a person & counts reps using *AI*.

lion mclionhead, 04/13/2023 at 18:33

It became clear that USB on the jetson nano didn't work unless ethernet was connected.  It was obviously broken & a lot of users have trouble getting reliable USB in general.  Power management for USB might have gotten busted when the board was hacked for servos.  Maybe it was back EMF from the servos.  Maybe it just burned out over time.  Disabling power management with /sys & usbcore.autosuspend=-1 didn't work.  Ideally there would be a way to trick the ethernet port into thinking it was connected, but the world's favorite search engine came up empty & ended that party.
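For reference, the /sys attempt mentioned above can be sketched roughly as follows. This is not the exact set of commands that was run: the function name usb_autosuspend_off and its optional directory argument are made up for illustration. The /sys/bus/usb/devices/*/power/control files are the standard Linux runtime power management knobs, and usbcore.autosuspend=-1 is the kernel command line equivalent.

```shell
# Sketch: keep every USB device powered by writing "on" to its runtime PM
# control file. The optional argument (hypothetical, handy for dry runs)
# overrides the sysfs directory that gets scanned.
usb_autosuspend_off() {
    root="${1:-/sys/bus/usb/devices}"
    for ctl in "$root"/*/power/control; do
        [ -e "$ctl" ] || continue   # the glob matched nothing
        echo on > "$ctl"
    done
}
```

Run as root with no argument, it walks the real sysfs tree.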

The best hope was hacking the tegra-xusb-padctl driver to stay on. There's a kernel compilation guide at https://developer.ridgerun.com/wiki/index.php/Jetson_Nano/Development/Building_the_Kernel_from_Source

cd /root/Linux_for_Tegra/source/public/
export JETSON_NANO_KERNEL_SOURCE=`pwd`
export TOOLCHAIN_PREFIX=/opt/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-
export TEGRA_KERNEL_OUT=$JETSON_NANO_KERNEL_SOURCE/build
export KERNEL_MODULES_OUT=$JETSON_NANO_KERNEL_SOURCE/modules
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} tegra_defconfig

# Change some drivers to modules 
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} menuconfig
# change to modules:
# Device Drivers → USB support → xHCI support for NVIDIA Tegra SoCs
# Device Drivers → USB support → NVIDIA Tegra HCD support
# Device Drivers → PHY Subsystem → NVIDIA Tegra XUSB pad controller driver
# disable Device Drivers → USB support  → OTG support
# disable Device Drivers → USB support  → USB Gadget Support


make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target zImage
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target modules
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra CROSS_COMPILE=${TOOLCHAIN_PREFIX} -j8 --output-sync=target dtbs
make -C kernel/kernel-4.9/ ARCH=arm64 O=$TEGRA_KERNEL_OUT LOCALVERSION=-tegra INSTALL_MOD_PATH=$KERNEL_MODULES_OUT modules_install
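It's easy to leave menuconfig without the changes sticking, so the generated .config can be checked for the drivers that should have become modules. This is a minimal sketch: config_is_module is a made up helper, and the symbol names in the comments are assumptions based on the mainline names for these drivers, so confirm them in the menuconfig help screens.

```shell
# Sketch: succeed if a Kconfig symbol is built as a module (=m) in a .config.
# Usage: config_is_module SYMBOL /path/to/.config
config_is_module() {
    grep -qx "CONFIG_$1=m" "$2"
}

# Example (symbol names are assumptions, not taken from the build above):
#   config_is_module USB_XHCI_TEGRA "$TEGRA_KERNEL_OUT/.config" || echo "xhci-tegra is not a module"
#   config_is_module PHY_TEGRA_XUSB "$TEGRA_KERNEL_OUT/.config" || echo "phy-tegra-xusb is not a module"
```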

Then there's a nasty official procedure for flashing the jetson.  Lions just back up /boot/Image, then copy the new files over:

cp build/arch/arm64/boot/Image /antiope/boot/

cp -a modules/lib/modules/4.9.140-tegra/ /antiope/lib/modules/

cp -a build/arch/arm64/boot/dts/*.dtb /antiope/boot/dtb/
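Since a bad kernel means a board that won't boot, the backup step is worth scripting. A minimal sketch, assuming the backup is kept next to the kernel as Image.orig (a made up name; the log doesn't say where the backup went):

```shell
# Sketch: copy a new kernel Image into a boot directory, keeping the first
# original as Image.orig (hypothetical name) so the board can be recovered.
# Usage: install_image NEW_IMAGE BOOT_DIR
install_image() {
    new="$1"
    boot="$2"
    # Only back up once, so later installs never overwrite the stock kernel.
    if [ -f "$boot/Image" ] && [ ! -e "$boot/Image.orig" ]; then
        cp -p "$boot/Image" "$boot/Image.orig" || return 1
    fi
    cp "$new" "$boot/Image"
}
```

With the paths used above, that would be install_image build/arch/arm64/boot/Image /antiope/boot.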

The trick is that the NFS mount requires ethernet to be plugged in, which bypasses the bug.  The offending modules are:


/lib/modules/4.9.140-tegra/kernel/drivers/usb/host/xhci-tegra.ko

/lib/modules/4.9.140-tegra/kernel/drivers/phy/tegra/phy-tegra-xusb.ko
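To confirm that the running kernel really loaded these as modules rather than built-ins, /proc/modules can be checked. A minimal sketch; module_loaded is a made up helper, and its optional second argument only exists so it can be pointed at a saved copy of /proc/modules. The kernel lists module names with underscores, so the hyphens in the .ko names are converted first.

```shell
# Sketch: succeed if a module shows up in /proc/modules.
# Usage: module_loaded NAME [MODULES_FILE]
module_loaded() {
    # xhci-tegra.ko is listed as xhci_tegra
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    grep -q "^$name " "${2:-/proc/modules}"
}

# Example:
#   module_loaded xhci-tegra && module_loaded phy-tegra-xusb && echo "both loaded"
```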

The USB hub is actually an RTS5411 on the carrier board, MFG ID 0bda. The jetson card has only 1 USB port, which supports OTG.

Ethernet is provided by an RTL8111 on the jetson card.

The lion kingdom managed to hack phy-tegra-xusb to not turn USB off after unplugging ethernet, but it won't turn on until ethernet is plugged in. Interestingly, once phy-tegra-xusb is loaded it can't be unloaded.

You can get phy-tegra-xusb to call the power_on functions without ethernet but it can't enumerate anything until ethernet is plugged in.  There's a power on step which is only done in hardware.  The power off step is done in software.  USB continued to disconnect, despite enabling the pads.

A key requirement is disabling power management for some drivers:

find /sys/devices/50000000.host1x -name control -exec sh -c 'echo on > "$1"' sh {} \;
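Some drivers may ignore the write or flip back later, so it helps to list anything that isn't pinned on. A minimal sketch; pm_not_on is a made up helper and the directory argument defaults to the host1x path from the command above.

```shell
# Sketch: print every runtime PM control file under a directory whose
# current value is not "on".
# Usage: pm_not_on [DIR]
pm_not_on() {
    find "${1:-/sys/devices/50000000.host1x}" -type f -name control |
    while IFS= read -r f; do
        [ "$(cat "$f" 2>/dev/null)" = "on" ] || printf '%s\n' "$f"
    done
}
```

No output means every control file under that directory reads "on".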

This kicks it up to 3W & starts roasting the heat sink.

The kernel prints "vdd-usb-hub-en: disabling", but searching turned up nothing about where the print comes from or any device tree entry specifying a hub.

Verified a stock jetpack image has the same problem.  Software options were all busted.

Outside chatgpt, a few more keyword combinations revealed many bit banged ethernet projects which can simulate a cable connection.  They're short on specifics about timing & voltage.

https://github.com/cnlohr/ethertiny

Much troubleshooting led to a circuit with the RX +/- pins on the jetson connected to some GPIOs on a 5V arduino.

Then you pulse the + pin followed by the - pin.  It seems to require pulses of 100ns length every 16ms.  It doesn't matter what the order of the pulses is, whether they're on the TX or RX pins, whether there are resistors.  The link light doesn't turn on.  Activity blinks 3 times & goes dark with an exponential backoff, but the kernel reports a link up & starts enabling USB.
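Those numbers line up with the normal link pulses that 10BASE-T uses for its link test (nominally 100ns wide, every 16ms give or take 8ms), which would explain why the kernel reports a link. A 100ns pulse is short for a microcontroller. As a rough calculation, assuming a 16MHz clock (typical for a 5V arduino, but not stated above) and a made up helper named cycles_for_pulse:

```shell
# Sketch: how many CPU cycles a pulse spans, given the clock in Hz and the
# pulse width in ns. The 16 MHz figure below is an assumption.
cycles_for_pulse() {
    awk -v hz="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", hz * ns / 1e9 }'
}

cycles_for_pulse 16000000 100   # prints 1.6
```

At that clock the pulse can only be 1 cycle (62.5ns) or 2 cycles (125ns) wide, so the pin has to be toggled with back to back port register writes rather than anything with per-call overhead.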

After enumerating, it still disconnected USB despite showing activity.  If the power management was all disabled, it would repeatedly enumerate the hub & disconnect.   Timing changes made no difference. 

This method only gives a solid link light when going into a hub. There are ways to bit bang packets, but they require a 20MHz clock.  There is a more comprehensive ethernet MAC for an RP2040, but it requires a clock speed over 20MHz.  It has to listen for packets from the jetson.  The lion kingdom could hack a bitbanged ethernet MAC many ways but in the interest of time, the next step is a W5500 SPI to ethernet gadget.  This provides a full ethernet MAC over SPI.  It could fit inside the enclosure.  It could probably work without its RJ45.

There's a trend of progressive failure.  Methods of enabling USB have gone from working to not working.  There are anecdotes of jetson nanos burning out over time when run without a fan.  It could be a marginal voltage or timer.  In the worst case, the USB ports could die permanently & I/O could be encapsulated in ethernet.

The W5500 arrived, only to turn USB on just by having its power plugged in.  All it took was the small load on the 3.3V rail.

Certainly a much easier hack than what lions were doing before.  Maybe the 3.3V load was what always turned USB on.  Maybe it was a goofy power management dependency, a missing component on the board, or a burned out load sensor.  Just a matter of waiting & seeing if this progressively fails.
