
2023 Hackaday Supercon Badge Hack - Vector Video

Ben Combee wrote 11/08/2023 at 20:39

This year for Supercon, I took the awesome vectorscope badge that the team had created and got it to play short video clips stored in the Raspberry Pi Pico's 2MB of flash on its 1.28" round LCD screen, which is driven by a GC9A01 controller chip.

This required trying out some new techniques, and I'm happy to publish details on how it works and how you can apply this to your own Pico projects too.

First, my GitHub repo for the hack is published at https://github.com/unwiredben/vector-video/.  This is a PlatformIO-based project, so it's written in C++ and builds to a UF2 file you can flash onto the badge.  It's likely this will work on non-badge hardware like the Waveshare development boards I've ordered, but the code will need to be modified to account for the lack of a "user" button.

While doing development, I had a number of issues, but also some non-issues.  It wasn't hard to get the video files embedded into the UF2 image.  The RP2040 has a nice mechanism for treating its attached SPI flash as a flat address space, with the hardware managing a memory cache to page in sections of the flash as needed.  From a program's point of view, you just include the video as a read-only array of bytes.  There is an option in the PlatformIO tools to partition the flash into program memory and a filesystem.  That may help reduce the time to flash a new image, since you don't need to reload the movie data each time, but it would add complexity to the code that feeds the video decoder.
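
Here's a minimal sketch of what that looks like. The header and array names are just illustrative -- in practice you'd generate the array offline (for example with xxd -i) and hand the data straight to pl_mpeg's memory-based loader:

```cpp
// movie_data.h -- hypothetical generated header, e.g. from: xxd -i movie.mpg
// Declaring the array const keeps it in the RP2040's XIP flash instead of RAM.
#include <cstdint>
#include <cstddef>

const uint8_t movie_data[] = {
    0x00, 0x00, 0x01, 0xB3, /* ...rest of the MPEG-1 elementary stream... */
};
const size_t movie_data_len = sizeof(movie_data);
```

The decoder can then read straight out of flash with plm_create_with_memory((uint8_t *)movie_data, movie_data_len, 0).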

Decoding Video

The video decoder I used was a modified version of pl_mpeg, an amazing single-file MPEG-1 decoder developed by Dominic Szablewski, aka phoboslab.  He originally published it in 2020, but I didn't notice it until earlier this year, after his Quite OK Image (QOI) format got some attention on Hacker News.  The code is quite easy to read and adapt, and I first tried it out to make a movie player for the Badger 2040, an eInk-based badge by Pimoroni.

My original intention was to turn the player code into a MicroPython add-on, but I quickly realized that the memory usage wasn't going to make that viable.  When I first tried to play a short video clip, the badge hung.  Tracking this down meant patching pl_mpeg's malloc/realloc override macros to log each allocation, and I found that the allocation of three frames of reference data used during decoding was failing: there just isn't enough of the RP2040's 264K of RAM left to handle that.  Since the round screen has an effective resolution of 240x240, you can verify this by doing some math: each decoded frame needs a 240×240 luma plane plus two 120×120 chroma planes, or 57,600 + 2 × 14,400 = 86,400 bytes, so three reference frames come to 259,200 bytes -- essentially all of the SRAM before the rest of the program gets a byte.
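
As an aside, the allocation logging looked roughly like this -- a sketch that assumes the PLM_MALLOC/PLM_REALLOC/PLM_FREE override hooks pl_mpeg exposes, with an illustrative log format:

```cpp
#include <cstdio>
#include <cstdlib>

// Logging wrappers so every allocation inside the decoder is visible
// on the serial console while hunting for the one that fails.
static void *logged_malloc(size_t sz) {
    void *p = malloc(sz);
    printf("plm malloc %u -> %p\n", (unsigned)sz, p);
    return p;
}
static void *logged_realloc(void *p, size_t sz) {
    void *np = realloc(p, sz);
    printf("plm realloc %p to %u -> %p\n", p, (unsigned)sz, np);
    return np;
}

// Define the override macros before pulling in the single-file implementation.
#define PLM_MALLOC(sz)     logged_malloc(sz)
#define PLM_REALLOC(p, sz) logged_realloc(p, sz)
#define PLM_FREE(p)        free(p)
#define PL_MPEG_IMPLEMENTATION
#include "pl_mpeg.h"
```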

However, just storing the luma (Y) reference frames gets us down to 172,800 bytes (3 × 57,600), which does fit into the available memory.  So, I modified the pl_mpeg code to have a luma_only mode.  This ended up being fairly easy -- there were three parts of the code where it was picking which plane to access.  In both plm_video_copy_macroblock and plm_video_interpolate_macroblock, the code runs the macroblock's instructions over all three planes, so skipping the non-existent color planes was easy.  The tricky one was plm_video_decode_block; I'd originally just aborted processing with an early return when the code got to selecting the color planes, but that left me with random macroblocks colored all white in the output.  After some debugging, I realized that I needed to skip only the code that modified the Cr/Cb blocks while still doing the other processing, because it has intentional side effects on the decoder state.  Once I fixed that, I got crystal clear decoding.

Displaying the Video Frames

I ran pl_mpeg in a mode where each call to plm_decode_video returns a frame that's ready for output.  I still had to get that data onto the screen, and this is where I hit some obstacles with the TFT_eSPI library that I'd picked.

The naive way to draw the pixels would be to loop from 0 to 240 on the x and y axes, read the luma value, and use tft.drawPixel to write it to the screen.  This works, but it's very slow, producing only a couple of frames per second.  The problem with this technique is that drawPixel does a LOT of work: it sets up a new SPI transaction, sends commands to tell the display where you're drawing, and then sends the color data.
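
In code, the slow path looks roughly like this (a sketch; the function name is mine, and tft is the project's TFT_eSPI instance):

```cpp
#include <TFT_eSPI.h>
#include "pl_mpeg.h"

extern TFT_eSPI tft;   // display object set up elsewhere in the project

// Naive per-pixel drawing: correct but only manages a few fps,
// because every drawPixel call opens its own SPI transaction.
void draw_frame_slow(plm_frame_t *frame) {
    for (int y = 0; y < 240; y++) {
        for (int x = 0; x < 240; x++) {
            uint8_t luma = frame->y.data[y * frame->y.width + x];
            tft.drawPixel(x, y, tft.color565(luma, luma, luma));
        }
    }
}
```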

There's a more efficient way to do this, and that's using the tft.pushColor API.  However, this requires some non-intuitive setup to use correctly.  First, you need to start the SPI transaction with tft.startWrite(), then set the window on which you'll be pushing data using tft.setAddrWindow.  Now, you can safely call tft.pushColor for all your pixels, followed by a tft.endWrite() to commit the transaction.
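
Putting that sequence together, the fast path looks something like this sketch (again with my own function name):

```cpp
#include <TFT_eSPI.h>
#include "pl_mpeg.h"

extern TFT_eSPI tft;

// Faster path: one SPI transaction and one address window for the whole
// frame, then a stream of pixel colors via pushColor.
void draw_frame_fast(plm_frame_t *frame) {
    tft.startWrite();                       // open the SPI transaction
    tft.setAddrWindow(0, 0, 240, 240);      // whole-screen window
    for (int y = 0; y < 240; y++) {
        for (int x = 0; x < 240; x++) {
            uint8_t luma = frame->y.data[y * frame->y.width + x];
            tft.pushColor(tft.color565(luma, luma, luma));
        }
    }
    tft.endWrite();                         // commit the transaction
}
```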

The TFT_eSPI.h header has many methods with similar names, and if you pick the wrong ones, you'll end up with partial writes or incorrect window sizes, and the video won't show up nicely aligned.

Colorizing the Frames

Since I was doing the luma-to-RGB conversion in the drawing code, it was relatively simple to change how I did that mapping.  Applying luma directly to R, G, and B was the simplest option and produced a grayscale image.  I could also map luma to just one or two of those channels, leaving the others at 0, to colorize the output.  I also played with some other mappings, including a black & white mode that used the highest bit of the luma value as an on/off flag and a rainbow mode that applied different colorizations depending on the Y coordinate of the pixel, producing a Pride Flag effect.
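
Here's a sketch of those mappings gathered into one function; the mode names and band boundaries are my own approximations, not the exact ones from the repo:

```cpp
#include <TFT_eSPI.h>
#include <cstdint>

// Hypothetical mode names -- the repo's actual names may differ.
enum ColorMode { GRAYSCALE, GREEN_ONLY, BLACK_WHITE, RAINBOW };

uint16_t colorize(TFT_eSPI &tft, uint8_t luma, int y, ColorMode mode) {
    switch (mode) {
        case GREEN_ONLY:
            // Map luma to a single channel for a tinted image.
            return tft.color565(0, luma, 0);
        case BLACK_WHITE:
            // Use the high bit of luma as a simple on/off threshold.
            return (luma & 0x80) ? TFT_WHITE : TFT_BLACK;
        case RAINBOW:
            // Use a different channel mapping per horizontal band of the
            // 240-pixel-tall screen for a Pride-flag effect.
            switch ((y * 6) / 240) {
                case 0:  return tft.color565(luma, 0, 0);
                case 1:  return tft.color565(luma, luma / 2, 0);
                case 2:  return tft.color565(luma, luma, 0);
                case 3:  return tft.color565(0, luma, 0);
                case 4:  return tft.color565(0, 0, luma);
                default: return tft.color565(luma / 2, 0, luma);
            }
        case GRAYSCALE:
        default:
            return tft.color565(luma, luma, luma);
    }
}
```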

Going Full Color

On Sunday morning, I realized that full-color decoding might work if the video size was reduced.  Going from 240x240 to 240x180 meant we could decode all three planes in 194,400 bytes (three frames of 43,200 luma + 2 × 10,800 chroma bytes each), which fits into the available memory.  This let me end my demo with the famous "rickroll" footage.  As the video was shot in 4:3, this ended up being closer to the original presentation than picking a 1:1 square out of the middle.  I modified the display code to center the image vertically, changing the tft.setAddrWindow call to start drawing partway down the screen.
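
A sketch of that centered full-color path is below; the fixed-point YCbCr-to-RGB conversion is the textbook one, which may not match exactly what the repo does:

```cpp
#include <TFT_eSPI.h>
#include "pl_mpeg.h"

extern TFT_eSPI tft;

static inline uint8_t clamp8(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

// Draw a 240x180 full-color frame centered vertically on the 240x240 screen.
void draw_frame_color(plm_frame_t *frame) {
    int w = frame->width;                         // 240
    int h = frame->height;                        // 180
    tft.startWrite();
    tft.setAddrWindow(0, (240 - h) / 2, w, h);    // start 30 rows down to center
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int Y  = frame->y.data[y * frame->y.width + x];
            // Chroma planes are half resolution in each dimension (4:2:0).
            int cb = frame->cb.data[(y / 2) * frame->cb.width + x / 2] - 128;
            int cr = frame->cr.data[(y / 2) * frame->cr.width + x / 2] - 128;
            uint8_t r = clamp8(Y + ((91881 * cr) >> 16));
            uint8_t g = clamp8(Y - ((22554 * cb + 46802 * cr) >> 16));
            uint8_t b = clamp8(Y + ((116130 * cb) >> 16));
            tft.pushColor(tft.color565(r, g, b));
        }
    }
    tft.endWrite();
}
```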

Adding Some Static

I originally wanted to encode actual TV static as MPEG-1 to show between clips.  However, while MPEG-1 can be very efficient, it's not when presented with nearly random input.  So, I instead simulated static with a random number generator feeding into a grayscale pushColor-based drawing loop.
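
The static loop is only a few lines -- something like this sketch, which feeds rand() into the same pushColor path (the repo may use a different random source):

```cpp
#include <TFT_eSPI.h>
#include <cstdlib>

extern TFT_eSPI tft;

// Fill the screen with random gray pixels to simulate TV static.
void draw_static_frame() {
    tft.startWrite();
    tft.setAddrWindow(0, 0, 240, 240);
    for (int i = 0; i < 240 * 240; i++) {
        uint8_t v = rand() & 0xFF;              // random gray level
        tft.pushColor(tft.color565(v, v, v));
    }
    tft.endWrite();
}
```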

Scripting it for Presentation

The vectorscope badge has a lot of LEDs and buttons, but most are connected to the Pico through the TLC59283 LED controller, which acts as an output shift register.  I didn't tackle working with that; instead I relied on the user button, which is connected to GPIO 19.  If the badge started without the user button held, it just played the Star Trek video; holding the button at startup made it play all of the videos.  The user button also changed colorization modes while a luma-only video was playing and delayed leaving the static display mode so I could time the start of new videos during my talk.
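
The button handling itself is minimal -- a sketch assuming the Arduino framework and that the button is active-low with the internal pull-up enabled (worth checking against the badge schematic):

```cpp
#include <Arduino.h>

constexpr int USER_BUTTON_PIN = 19;   // "user" button GPIO on the badge

void setup_button() {
    // Assumes the button pulls the pin to ground when pressed.
    pinMode(USER_BUTTON_PIN, INPUT_PULLUP);
}

bool user_button_pressed() {
    return digitalRead(USER_BUTTON_PIN) == LOW;
}
```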

Future Directions

The pl_mpeg code also has an MPEG-1 Layer II audio decoder which I've not used.  I intentionally encoded the MPEG-1 clips without audio and made sure no audio buffers were allocated at runtime.  However, with the badge's DAC output and ADC input that connect to a speaker, it would be interesting to see whether audio playback would work.  That would also involve some exploratory work on synchronization.  The current code only tries to keep the video at 30fps, but I think some of the clips were natively encoded at 24fps and are playing a little fast.
