
PRUSS support for newer kernels

A communication framework between ARM and PRUSS on BeagleBone Black

The aim is to build a framework for easy communication between the ARM and the PRUSS. Currently these features are provided by libprussdrv and the Linux kernel's remoteproc infrastructure. However, both have their limitations and require developers to hack kernel drivers to optimize them for their application. This project develops a lightweight, robust, easy-to-use yet powerful ARM-PRU communication framework that should make life a lot easier for BeagleBone users.

Understanding vrings for PRU

The PRU subsystem comprises two 32-bit RISC processors running at 200 MHz. This allows us to run programs independently of the main ARM core. However, all AMP (asymmetric multiprocessing) configurations require some mechanism for inter-processor communication. Most interactions can be broadly classified into two categories:

1. Instructions or commands, ranging from a single byte to a couple of bytes. These are messages designed by the programmer to instruct the other processor ( we are not referring to the processor instruction set here ). Examples: sending flags to start a capture, wait for task completion, mark a process as running or end a capture, or simply small data packets such as direction and step values for a stepper motor in a 3D printer driven by the PRU. Such data transfer is possible by:

a. Writing directly to shared memory address spaces ( DDR, PRU DRAMs, PRU SRAM ).

b. Using the prussdrv library, which is built on the Linux UIO driver framework ( a userspace driver framework for generic memory-mapped devices ).

c. Modifying and compiling this excellent remoteproc driver for kernel 3.8 according to your application's requirements ( for advanced users ). Abhishek briefly explains syscalls/downcalls here.

d. Using the libpru library for 3.14+ kernels, which was written as part of this project.

Most of the time, prussdrv and libpru will serve the purpose.
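As a minimal sketch of option (a), a command can be passed through a small structure placed in shared memory. Everything named here (struct pru_mailbox, the CMD_* values, the field layout and both helper functions) is hypothetical and not taken from prussdrv or libpru. On real hardware the pointer would come from mapping PRU shared RAM; this sketch works on any ordinary buffer so that it can be tried on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command codes, chosen only for this illustration. */
enum pru_cmd { CMD_IDLE = 0, CMD_START_CAPTURE = 1, CMD_STOP_CAPTURE = 2, CMD_STEP = 3 };

/* Hypothetical mailbox layout living in memory visible to both processors. */
struct pru_mailbox {
    volatile uint32_t cmd;       /* command word, written last by the sender */
    volatile int32_t  direction; /* stepper direction: +1 or -1 */
    volatile uint32_t steps;     /* number of steps to take */
};

/* Sender side: fill in the arguments first, then publish the command word. */
static void mailbox_post_step(struct pru_mailbox *mb, int32_t direction, uint32_t steps)
{
    assert(direction == 1 || direction == -1);
    mb->direction = direction;
    mb->steps = steps;
    __sync_synchronize();        /* GCC/Clang full barrier: arguments before command */
    mb->cmd = CMD_STEP;
}

/* Receiver side: returns 1 and copies the arguments if a step command is pending. */
static int mailbox_take_step(struct pru_mailbox *mb, int32_t *direction, uint32_t *steps)
{
    if (mb->cmd != CMD_STEP)
        return 0;
    *direction = mb->direction;
    *steps = mb->steps;
    mb->cmd = CMD_IDLE;          /* mark the mailbox as free again */
    return 1;
}
```

Writing the command word last means the receiver never sees a command whose arguments are only half written, which is the main thing to get right with this kind of polling mailbox.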

2. The second type of interaction is streaming data between the ARM and the PRU. This is necessary for a different class of PRU applications, such as streaming input data from a high-speed sensor, outputting custom or existing protocols on PRU pins ( DMX example ), a logic analyser ( BeagleLogic ) and several others you might think of. The PRU can even access the Ethernet controller to process rx/tx packets on the fly and hand them over to the ARM. The possibilities are huge.

This is possible using vrings ( virtio_ring ), which are part of the Linux virtio ( Virtual IO ) drivers.

Virtio was originally designed to provide a common implementation for virtual IO device drivers across different hypervisor environments. The virtio ecosystem provides two things. First, a Linux API to handle vdevs ( virtio devices ): registering a device, probe, remove, configuration, interrupt handling, etc. Second, a transport layer ( since virtio, before everything else, is all about IO ). The transport mechanism consists of an abstraction layer provided by 'virtqueue' and an implementation of this abstraction provided by 'vring'. Hence vrings are an implementation of virtqueue. But since vring was the only available implementation of virtqueue, both were collapsed into one [ virtio_ring ]. [ lkml merger discussion ].

So vdevs could eventually be a virtual PCI, virtual network device, virtual block device etc for the guest VM. This briefly explains virtio which is essential to understand what follows. More details on virtio for further reading [ 1 ] [ 2 ] [ 3 ].

Now, virtio combined with the Linux remoteproc and rpmsg frameworks can be used for communication between processors. For the PRU, this project uses the underlying vring directly instead of going through rpmsg. The reason is that rpmsg, being a full framework, requires significant code on both participating processors. It provides a bus on which each processor is assigned an address, giving a source-destination address based communication model for AMP configurations. It is best suited to hardware such as the OMAP4, which has a dual Cortex-A9, a dual Cortex-M3 and a C64x+ mini-DSP. The PRU, by contrast, is not meant for heavy computation and has limited code space ( 8KB for each core ). Thus, making use of vring and avoiding the code space and latency overhead of rpmsg seems logical.
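To make the vring idea concrete, here is a simplified sketch of the three parts of a ring: a descriptor table, an "available" ring and a "used" ring. The struct fields mirror those of the Linux virtio_ring definitions, but the fixed ring size, the struct simple_vring wrapper and the four helper functions are assumptions made for illustration. This is not the pruss_remoteproc code.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u   /* illustrative size; must be a power of two */

struct desc       { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct avail_ring { uint16_t flags; uint16_t idx; uint16_t ring[RING_SIZE]; };
struct used_elem  { uint32_t id; uint32_t len; };
struct used_ring  { uint16_t flags; uint16_t idx; struct used_elem ring[RING_SIZE]; };

/* Hypothetical wrapper holding one ring plus each side's private cursor. */
struct simple_vring {
    struct desc       desc[RING_SIZE]; /* buffer addresses and lengths */
    struct avail_ring avail;           /* buffers offered to the other side */
    struct used_ring  used;            /* buffers handed back */
    uint16_t last_avail_seen;          /* consumer of avail: next entry to take */
    uint16_t last_used_seen;           /* consumer of used: next entry to reclaim */
};

/* Offering side: make descriptor `head` available to the other processor. */
static void vring_offer(struct simple_vring *vr, uint16_t head)
{
    vr->avail.ring[vr->avail.idx % RING_SIZE] = head;
    __sync_synchronize();              /* entry must be visible before the index */
    vr->avail.idx++;
}

/* Receiving side: take the next offered descriptor, or -1 if none is pending. */
static int vring_take(struct simple_vring *vr)
{
    if (vr->last_avail_seen == vr->avail.idx)
        return -1;
    return vr->avail.ring[vr->last_avail_seen++ % RING_SIZE];
}

/* Receiving side: hand descriptor `head` back after using `len` bytes of it. */
static void vring_return(struct simple_vring *vr, uint16_t head, uint32_t len)
{
    struct used_elem *e = &vr->used.ring[vr->used.idx % RING_SIZE];
    e->id = head;
    e->len = len;
    __sync_synchronize();
    vr->used.idx++;
}

/* Offering side: reclaim a returned descriptor, or -1 if none is pending. */
static int vring_reclaim(struct simple_vring *vr, uint32_t *len)
{
    if (vr->last_used_seen == vr->used.idx)
        return -1;
    struct used_elem *e = &vr->used.ring[vr->last_used_seen++ % RING_SIZE];
    *len = e->len;
    return (int)e->id;
}
```

For PRU-to-ARM streaming, one plausible arrangement is that the ARM offers empty buffers, the PRU takes one, fills it and returns it with the number of bytes written, and the ARM reclaims it and offers it again. A kick (interrupt) after vring_offer or vring_return tells the other side to look at the ring.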

This brings us to the pruss_remoteproc driver which is available for kernel 3.14 ( beagleboard's fork ).

Continued in next post ....

https://github.com/shubhi1407/PRU-framework/wiki/Remoteproc

  • Week 11 Progress

    Shubhangi Gupta, 08/19/2015 at 05:38

    Developments:

    1. Exposed the vring to userspace through a misc device, which streams data from the PRU to the ARM.

    2. Proper code alignment and commenting. Not completed yet.

    Issue:

    The blocking read on the misc device still faces synchronization issues. Fixing this involves correcting the flow of pending-buffer interrupts sent from the PRU to the ARM and the subsequent transfer of the buffers to userspace by copy_to_user.

    Next Week:

    Wrapping things up, example code and documentation

  • Week 10 Progress

    Shubhangi Gupta, 08/19/2015 at 05:37

    Developments:

    Earlier I was able to write to all 512 (or fewer) buffers from the PRU, but they were not being added back to the vring (as free buffers) after the data was consumed by the host processor. This last roadblock (hopefully) has been solved, resulting in successful transfers from PRU to ARM, even continuous streaming of data using the vring :D


    Issues:
    One major optimization remains: I get different transfer rates depending on how frequently the PRU kicks (i.e. interrupts) the ARM. Kicking the ARM after writing each 512-byte buffer stalls the ARM because of too many interrupts within too short a time. Kicking only after all 512 buffers have been filled by the PRU forces the PRU to wait for the ARM to consume buffers before it can start using them again. So, to attain maximum throughput, an optimum number of filled buffers after which the ARM is kicked needs to be found. The user is also free to choose this value (in the PRU firmware) depending on their requirements.
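The trade-off can be sketched as a small counter that decides when to kick. This is an illustration only: struct kick_policy, its functions and the threshold values are hypothetical, and the real firmware may count differently. The sketch just shows how the number of interrupts falls as the threshold rises.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BUFFERS 512u  /* buffers per vring, as stated in the log */

/* Hypothetical bookkeeping for "kick after every N filled buffers". */
struct kick_policy {
    uint32_t threshold;   /* N: filled buffers per kick */
    uint32_t pending;     /* buffers filled since the last kick */
    uint32_t kicks;       /* total kicks issued so far */
};

static void kick_policy_init(struct kick_policy *p, uint32_t threshold)
{
    assert(threshold >= 1 && threshold <= NUM_BUFFERS);
    p->threshold = threshold;
    p->pending = 0;
    p->kicks = 0;
}

/* Call after each buffer is filled; returns 1 when the ARM should be kicked. */
static int buffer_filled(struct kick_policy *p)
{
    if (++p->pending < p->threshold)
        return 0;
    p->pending = 0;
    p->kicks++;
    return 1;
}

/* Count the kicks produced while filling `buffers` buffers. */
static uint32_t kicks_for(uint32_t threshold, uint32_t buffers)
{
    struct kick_policy p;
    kick_policy_init(&p, threshold);
    for (uint32_t i = 0; i < buffers; i++)
        buffer_filled(&p);
    return p.kicks;
}
```

With a threshold of 1, filling all 512 buffers produces 512 interrupts. With a threshold of 512 it produces a single interrupt but leaves the PRU with no free buffers until the ARM catches up. Values in between are where the optimum mentioned above would be searched for.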


    Next Week:
    Patches for 4.1; expose a misc char device to stream data to the user.

  • Week 9 Progress

    Shubhangi Gupta, 08/19/2015 at 05:36

    Developments:
    I am running slightly behind schedule. Instead of exposing vrings to userspace, I worked on allowing custom callbacks for virtqueues, which are executed after the kick. These callbacks are necessary to consume pending messages. Pushed the code to GitHub.


    Issues:
    Lately I have been able to spend a little less time on the project than I should have. Need to catch up.


    Next Week:
    Finish the previous week's objectives as fast as possible and get back to working on the 4.1 patches.

  • Week 8 Progress

    Shubhangi Gupta, 08/19/2015 at 05:36

    Developments:
    1. Virtio-based vring communication now works for the PRU. However, everything is still within the kernel.
    2. Test firmware for the PRU cores, which provides vdev info to the rproc driver through the resource table. It communicates data using buffers (vrings) and informs the other processor of pending data using 'kicks', which are actually sysevents.


    Issues:
    The single most troubling issue was finding the location of the resource table within PRU DRAM.
    Solution: the .resource_table section specified in the linker command file (am33xx.cmd) is written to a fixed address.
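Once the table sits at a known address, the host side can walk it to find the vdev entry. The sketch below assumes the header layout used by the Linux remoteproc framework (version, entry count, two reserved words, then an array of byte offsets, with each entry starting with a 32-bit type) and that RSC_VDEV has the value 3. The function rsc_find is hypothetical and is not the driver's own parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSC_VDEV 3u  /* assumed to match RSC_VDEV in Linux's enum fw_resource_type */

/* Fixed part of the table; an array of `num` 32-bit offsets follows it. */
struct rsc_table_hdr {
    uint32_t ver;
    uint32_t num;
    uint32_t reserved[2];
};

/* Return the byte offset of the first entry of `type`, or -1 if absent/invalid. */
static long rsc_find(const uint8_t *tbl, size_t size, uint32_t type)
{
    struct rsc_table_hdr hdr;

    if (size < sizeof hdr)
        return -1;
    memcpy(&hdr, tbl, sizeof hdr);

    /* The offset array must fit inside the table. */
    if (hdr.num > (size - sizeof hdr) / sizeof(uint32_t))
        return -1;

    for (uint32_t i = 0; i < hdr.num; i++) {
        uint32_t off, entry_type;

        memcpy(&off, tbl + sizeof hdr + i * sizeof off, sizeof off);
        if (off > size - sizeof entry_type)   /* entry header must be in bounds */
            return -1;
        memcpy(&entry_type, tbl + off, sizeof entry_type);
        if (entry_type == type)
            return (long)off;
    }
    return -1;
}
```

The bounds checks matter because the table comes from a firmware image, so offsets should not be trusted before they are validated.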


    Next Week:
    Expose vring to userland as char device and test performance.

  • Week 7 Progress

    Shubhangi Gupta, 08/19/2015 at 05:35

    Developments:

    1. vrings are allocated automatically by the remoteproc driver if they are present in the firmware resource table. Currently 2 vrings are permitted ( RX and TX ), each with 512-byte buffers. Each ring can hold a maximum of 512 buffers.

    Issues:

    1. Patches for 4.1 are postponed. They were eating up necessary time. More testing showed that the driver could not be probed again once removed. More files need to be patched.

    2. The vring's physical address in the driver and the one written to the PRU's memory are slightly different. Will have to resolve this issue asap so that message passing can begin successfully.

    Next Week

    1. Try to finish vrings and begin writing example code.

  • Week 6 Progress

    Shubhangi Gupta, 08/19/2015 at 05:34

    Developments:
    1. This week was dedicated to getting the driver to run on the latest kernel. The pruss remoteproc driver now successfully compiles and runs on 4.1.1. Patches for the kernel source will be up by tonight.


    Issues faced:
    1. Compiling a new kernel for beaglebone black and preparing a SD card for it is one mammoth task for someone doing it for the first time. I'll write a blog post on that on my hackaday project page later.

    2. Debugging several lines of kernel code which broke the driver when ported from 3.14 to 4.1.1


    Next Week:
    Get back to vrings.

  • Week 5 Progress

    Shubhangi Gupta, 07/01/2015 at 11:06

    Developments:

    1. Fixed a major bug in the driver which had gone unnoticed: the word offset given to the read/write routines was buggy.

    2. Improved user library. Previously the PRU cores booted when the remoteproc driver was probed. Now the user has pruss_boot ( "fw_path" , PRU0/PRU1 ) and pruss_shutdown ( PRU0/PRU1 ) routines to independently control power to each core.

    3. Able to register a virtual device with vrings using remoteproc.

    4. Better example application in which the ARM writes 2 values to PRU memory -> interrupts PRU1 -> PRU1 adds the 2 values and writes the result to a new location -> PRU1 sends an interrupt to the ARM -> ARM validates the result.

    Issues Faced:

    1. Have lagged behind a bit while trying to understand the intricacies of vring communication. Need to catch up.

    Next week:

    1. Finish and wrap up vring based message passing from both ARM and PRU end.

  • Week 4 Progress

    Shubhangi Gupta, 07/01/2015 at 10:42

    Developments:
    1. Worked on user library code to allow individual PRU core boot and shutdown.
    2. Wrote PRU firmware to receive interrupt from ARM and reply to ARM with another interrupt.
    3. Made all three components work together: user program, driver and PRU firmware.
    4. Error handling in library.
    5. Most importantly worked on repo to prepare it for tomorrow's evaluation. Prepared makefiles and added build procedure to Readme. git looks clean now.


    Issues faced:
    1. Circular queues are already part of PRU-bridge. Need to discuss thoroughly with apaar how to integrate the two to avoid code duplication.
    2. Exploring TI's progress on the PRU vring implementation is taking time.
    3. Cross-compiling the 4.1 kernel is throwing errors.

  • Week 3 Progress

    Shubhangi Gupta, 07/01/2015 at 10:40


    Developments:
    1. Send sysevents ( 0-63 ) from userspace to the PRU through the driver.
    2. Configure the sysevent->channel and channel->host maps through the firmware resource table. A very easy-to-use method.
    3. Expose driver interrupts ( i.e. EVTOUT from PRU->ARM ) to userspace.
    4. Interrupt handling in the driver, with the user notified about the interrupt. The user can wait for a specified interrupt from the PRU (indefinitely or with a timeout).
    5. The user can pass a callback function along with the wait, e.g. pruss_wait_for_interrupt ( EVTOUT1, mycallback ). mycallback is written by the user and receives the EVTOUT number as its argument if the interrupt was successfully received.


    Issues:
    1. The poll function was troublesome. It requires a dummy read on the file descriptor before polling for it to work correctly.
    2. A considerable amount of time goes into finding the best way to implement something in line with driver coding practices. Lots of resources are scattered over the internet. Collecting all the important links along the way. Will write a blog post on the issues faced and how they were resolved.
    3. It took some time to figure out the resource tables of the PRU firmware.
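The dummy read before poll mentioned in the first issue can be sketched as below. This assumes the interrupt is exposed through a file that behaves like a sysfs attribute, where a read is needed before poll will block for the next notification and where the events of interest would typically be POLLPRI | POLLERR. The function name wait_for_event is hypothetical and is not the library's pruss_wait_for_interrupt.

```c
#include <assert.h>
#include <fcntl.h>      /* open(), for callers that open the event file */
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the descriptor reported an event, 0 on timeout, -1 on error.
 * Pass a negative timeout_ms to wait indefinitely. */
static int wait_for_event(int fd, short events, int timeout_ms)
{
    char dummy[16];
    struct pollfd pfd;
    int rc;

    /* Dummy read: rewind and read once so that poll() waits for the next
     * notification instead of returning on already-seen state. */
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    if (read(fd, dummy, sizeof dummy) < 0)
        return -1;

    pfd.fd = fd;
    pfd.events = events;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;                       /* 0 = timeout, -1 = error */
    return (pfd.revents & (events | POLLERR)) ? 1 : 0;
}
```

A caller would open the event file once with open(path, O_RDONLY) and then call wait_for_event in a loop, invoking its own callback each time the function returns 1.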


    Next Week:
    1. Complete work on DDR memory circular buffer.
    2. Allow custom modules to plug into remoteproc, so that advanced PRU users can use the low-level functions provided by pruss_remoteproc to utilize the PRU more effectively than a userland implementation can. This might take some time.

  • Week 2 Progress

    Shubhangi Gupta, 06/10/2015 at 12:06

    Developments:

    1. Exposed appropriate sysfs attributes for data length, offset, memory type and data buffer to userspace.
    2. prussdrv-like functions for reading and writing to DRAM0, DRAM1 and SHR-RAM work now, allowing a maximum of 4kB of data to be written in one call.
    3. Tested writing 2D integer arrays to PRU memory and reading them back. Example code on GitHub.
    4. Dynamically boot/shut down the PRU from userspace at any time using the driver-device bind/unbind function. This allows rmmod pruss_remoteproc without the --force option.
    5. Able to provide custom sysevt->channel and channel->host mappings to the 3.14 pruss_remoteproc driver.



    Issues faced:
    1. Abandoned Python for the userspace library. Realized Python is not feasible for writing code which interacts with hardware that has extremely limited space.

    In Python, almost everything is an object. Even an integer is an object, which pushes its size to 12 bytes, i.e. 8 bytes of unnecessary information. With a PAGE_SIZE of 4096 bytes, which happens to be the size of a sysfs file, we cannot afford Python. Yes, there would be workarounds within the language itself, but when time is short, C saves the day.


    3. Had to decide whether to send data via sysfs or to pass a userspace pointer (to the data buffer) down to the driver. The sources I read advised against passing userspace pointers down to the kernel. Hence the former was chosen.

    For more info as to why sending pointers is a bad idea read this: http://www.makelinux.net/ldd3/chp-3-sect-7


    4. A persistent segmentation fault which took significant time to debug. It was due to dereferencing a null pointer.


    5. Writing to sysfs was triggering both the show and store functions. Issue resolved through discussion with my mentor: use open ( ), write ( ) etc. instead of fopen ( ), fwrite ( ).
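A minimal sketch of the unbuffered approach from the last issue: one open, one write, one close, so that a single system call carries the whole payload to the attribute. The helper name attr_write is hypothetical, and any path passed to it is up to the caller; it is not part of the project's library.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes to the file at `path` with a single write(2) call.
 * Returns 0 on success, -1 if the file cannot be opened or the write is short. */
static int attr_write(const char *path, const void *buf, size_t len)
{
    ssize_t n;
    int rc;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;

    n = write(fd, buf, len);   /* no stdio buffering between caller and kernel */
    rc = close(fd);

    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}
```

Because nothing is buffered in userspace, the data reaches the kernel exactly when write returns, and the descriptor is opened write-only so no read is issued on the attribute.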


    Next Week:
    1. Allow INTC mapping and configuration from userspace.
    2. Ability to dynamically allocate larger circular buffers in DDR mem.
    3. Extend userspace library to allow DDR read/write.



    Code on github


  • 1
    Step 1

    Kernel module

    1. Install the kernel headers that match the running kernel:

    sudo apt-get install linux-headers-$(uname -r)

    2. Make a backup of the existing pruss_remoteproc driver (if any):

    sudo cp /lib/modules/$(uname -r)/kernel/drivers/remoteproc/pruss_remoteproc.ko /lib/modules/$(uname -r)/kernel/drivers/remoteproc/pruss_remoteproc.ko.back

    3. cd to /drivers/remoteproc in the cloned repo and build the module:

    make

    4. Install the compiled module:

    make install

    PRU firmware

    1. Install TI's PRU code generation tools from this link.
    2. Clone TI's PRU software support package from this link to any suitable directory. For example:

    cd /usr/share
    git clone git://git.ti.com/pru-software-support-package/pru-software-support-package.git

    3. Set the SWDIR variable in /firmware/Makefile to the directory in which you cloned the package in step 2 (skip this step if you cloned into /usr/share).
    4. cd to /firmware and build the firmware:

    make

    5. Install the firmware. If the steps above were performed on a host PC:

    make install-tobb

    If the steps above were performed on the BBB:

    make install-frombb

    User library and examples

    1. cd to /userspace and run the 'make' command. This compiles the library and the examples:

    make


Discussions

Łukasz Przeniosło wrote 05/16/2015 at 16:08

Looking forward to your work. So far, with the current PRU support, I find it more efficient to add a separate MCU on the board with the BBB.
