
Discrete component object recognition

Inspired by a recent paper, the goal is to develop a system that can recognize numbers from the MNIST database without a microcontroller

MNIST number recognition is old, and boring. Wikipedia reports outstanding error rates achieved by complex ML algorithms and pre-processing trickery. Nevertheless, all of these are overkill: heavy, slow and, well... boring.

This project takes a lean approach to object recognition, inspired by a recent paper from Mennel et al. published in the March 2020 issue of Nature.

How simple can we go?
* Is a low resolution camera simple enough? Can a handful of discrete pixel sensors replace a picture and allow for accurate object recognition?
* What would be a good enough model? Would a shallow neural network be sufficient? What about a simple decision tree?
* Is an Arduino too powerful? Can an array of voltage comparators beat the latest FPGA? What about a 555?

The project paves the way for a future of discrete component AI implementations.

The goal

The goal of this project is to build a simple machine vision system trained to do one thing, do it as well as possible with the available resources, and do it fast. Really fast.

The idea came from a recent publication in Nature: https://www.nature.com/articles/d41586-020-00592-6, discussed in the Nature Podcast from 4 March 2020. This led me to wonder: how quick is quick, and is there anything in between the system developed by Mennel et al. and a typical camera-plus-processor system?

The project evolved naturally as follows:

  1. Introduction
  2. Determine minimum requirements for sensor array
  3. Determine minimum requirements for ML model
  4. Determine minimum requirements for object recognition hardware
  5. Prototyping
  6. Final design
  7. Tests with a binary model signal
  8. Implementation of decision tree model using digital signal
  9. Using resistor transistor logic for the Binary Decision tree
  10. Cats vs dogs, Random forests and Neural networks
  11. Minimum model for discrete component neural networks
  12. Design of neural network nodes using discrete components
  13. PCB design of MNIST number recognition using discrete component neural network
  14. Translating a neural network to a discrete component memristor

The final objective is to have a system that is as close as possible to state of the art ML algorithms but implemented on discrete components for maximum speed at an acceptable complexity level.

Conclusion

The project was a partial success. Accuracy fell short of the objective; nevertheless, the concept worked for decision trees.

It has since evolved to include neural networks. Since the project's direction still fits the name it had on the day it started, it'll stay here for the time being.

The results showed significant potential for AI on discrete components, using sensors to obtain either analog or digital signals and allowing a tunable trade-off between complexity and accuracy.

NN_MEMRISTOR.zip

KiCad files for the memristor.

x-zip-compressed - 201.18 kB - 05/03/2020 at 23:38


NN.sch

Sensor array schematic

sch - 21.98 kB - 05/03/2020 at 23:00


NN.kicad_pcb

Sensor array pcb file.

kicad_pcb - 182.96 kB - 05/03/2020 at 23:00


SuccesfulSimulationDiodes.circuitjs.txt

Falstad circuit simulation for digit 0.

plain - 5.18 kB - 05/03/2020 at 20:07


DiscreteComponentPrototype.R

R script to test the performance of the final prototype.

r - 8.56 kB - 04/07/2020 at 03:10



  • 1 × Arduino Uno R3
  • 1 × LDR 40k ohm
  • 1 × LDR 14k ohm
  • 1 × LDR 60k ohm
  • 3 × 10 k ohm resistors


  • Translating a neural network to a discrete component memristor

    ciplionej, 05/10/2020 at 11:52

    The goal of this task is to describe the steps needed to translate a neural network onto a discrete component implementation.

    Sample neural network

    As a starting point we'll use a simple neural network with two inputs or signals and two outputs or classes. In the MNIST example, each input would be a sensor on the 4x4 matrix and the outputs or classes would be digits from 0-9.

    In the image below we have a neural network composed of two neurons, identified as 0 and 1. Since this neural network has no hidden layers, the output neurons are the only neurons.

    The calculations carried out at the neurons are as follows:

    • Neuron 1 = Bias (-0.4438) + Input X1 * weight_X1_1 (-0.30395) + Input X2 * weight_X2_1 (2.16963)
    • Neuron 0 = Bias (0.93455) + Input X1 * weight_X1_2 (0.81321) + Input X2 * weight_X2_2 (-1.66037)

    We'll now translate this onto a memristor as follows:

    • Each neuron's (0 or 1) signal will be calculated as a summation of currents
    • The bias term will be translated onto a current (positive or negative).
    • The Input * weight terms will be translated onto currents (positive or negative).
    • All the terms will be channeled onto a wire and the summation current will be transformed onto a voltage signal using a resistor. The signal will vary in intensity with the increasing current across the neuron.

    Positive value currents will be simply calculated as the product of the incoming voltage and conductance of a resistor (I = V x G or I = V / R). The resistor has to be chosen so that the conductance of the resistor equals the value of the weight. For example, in the neural network above, for Sensor X1 and the Output signal 0, the weight equals 0.81321 and the equivalent resistor would have a resistance of 1.2 ohms. The resulting current will be injected into the neuron wire.

    Negative value currents will be calculated by multiplying the incoming voltage by a suitable resistor. The resulting negative current will be subtracted (removed) from the neuron line using a transistor. The resistor has to be chosen so that its conductance equals the weight divided by the beta of the transistor. For example, in the neural network above, for sensor X1 and the output signal 1, the weight equals -0.30395. Assuming the transistor has a beta of 100, the resistor would have to be (0.30395/100)^-1 = ~330 ohms. There is a better way to do this, suggested by the vibrant Stack Exchange community, which avoids the impact of beta varying as a function of collector current. We'll stick to this simpler solution for now and discuss the alternative implementation below.

    Finally, the bias in neuron 1 is a negative one. Hence we need to use the transistor to subtract current from the neuron line. The value for the bias is -0.4438. Using the same procedure as above, we'll calculate the resistor value as (0.4438/100)^-1 = 225 ohms.
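Putting the three rules above together, the weight-to-resistor mapping can be sketched in a few lines. This is a hypothetical helper, not part of the project's code, assuming the fixed beta of 100 used above and ideal components:

```python
# Map a neural-network weight to the resistor implementing it, per the
# scheme above: positive weights inject I = V * G, so G = weight and
# R = 1 / weight; negative weights are subtracted through a transistor,
# so G = |weight| / beta and R = beta / |weight|.
def weight_to_resistance(weight, beta=100):
    if weight == 0:
        return float("inf")       # no connection at all
    if weight > 0:
        return 1.0 / weight       # plain resistor into the neuron wire
    return beta / abs(weight)     # base resistor of the subtracting transistor

print(round(weight_to_resistance(0.81321), 2))   # X1 -> neuron 0: ~1.23 ohms
print(round(weight_to_resistance(-0.30395)))     # X1 -> neuron 1: ~329 ohms
print(round(weight_to_resistance(-0.4438)))      # bias, neuron 1: ~225 ohms
```

The same mapping scales to any number of inputs and outputs, which is what makes the board layout systematic later on.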

    Adding it all up

    In order to add the currents and get the output of the neuron, we use a wire joining all the currents and display the signal as the intensity of an LED. We could also measure the voltage with an ADC if we wanted to handle the output digitally.

    The resulting circuit is simple, if slightly dangerous, as a close look at the ammeter below the S2 signal shows. The reason is the small number of sensors: with so few sensors and such poor data, we are stuck with very large weights and biases. This means that if we move the sliders controlling the sensor voltages to their maximum values, we get currents in the order of a few amperes. When scaling up to as few as 16 pixels/sensors, this is no longer the case and the weights and biases are much smaller. We'll tackle this issue later.

    An improved negative weight calculation

    Since the beta of the transistors is not a reliable number, I asked the Stack Exchange community to lend a hand with the issue caused by its variation. I got a lot of useful information. One answer led me to write this log, but all of them sent me on a way to solve...


  • PCB design for Discrete component neural network

    ciplionej, 05/03/2020 at 20:09

    The goal of this task is to design a PCB that will allow a trained neural network to be used to detect MNIST numbers from an image.

    Sensor array and signal selection

    During the analysis of the results reported in the previous log, I realised that jumping from a 16-pixel sensor matrix to a 49-pixel sensor matrix had two significant effects. The first was on the design of the sensor array; the second, even more significant, was on the memristor. Considering my limited experience in PCB design, and looking at simulations of the memristor in action, I decided to stick to the 4x4 matrix using the intensity level from the sensors instead of the binary response, also because the 7x7 matrix with the binary-signal model did not perform that well and came at a very significant cost in complexity.

    Assuming, like I did for the previous model, an average of 1.5 components per node for the memristor, and using 49 inputs and 10 outputs, I'd have a whopping 450 components to mount (that'd be, for every input, 10 weights and a bias, multiplied by the number of outputs). Anyway, my initial 1.5 components per node was a little bit off, as we'll see later.

    The attached simulation can be run here and shows that the neural network is quite robust and small changes in some of the sensors do not make a significant difference in the result. Nevertheless, we'll need to test this in real life.

    The simulation actually helped me to see a significant issue with the design that I had not predicted. This was quickly solved with a bit of learning about passive components. Shocking as it was for me, I didn't know electricity could flow backwards.

    Memristor design

    The initial memristor design needed a bit of math to adjust the negative weights. The positive weights were calculated simply by multiplying the input voltage by the conductance of the resistor, the result being a current.

    For the negative weights, I needed to subtract current, and this was achieved by using an NPN transistor that conducts the base current multiplied by beta (I didn't even know this number existed; learning as we go along). All things considered, I found out that beta was not constant with the input current.

    If anyone knows of a type of transistor that has a beta that has a small variation with current, please send me a comment. I'd really appreciate any help on this.

    I tested the impact of a variation in beta in the simulation and found that this was also not very significant on the result of the neural network. Then again, we'll have to test this in real life.

    My problem appeared when I put an LED at the back of the digit line. In order to see whether it's one digit or the other, I chose to add an LED with a resistor. The problem was that in some conditions the current chose to flow backwards instead of forwards. Since the signal is a current, I could not afford losing current to other parts of the circuit. This was quickly solved with a diode in line with the input resistor, as can be seen in the video below.

    This is an incredibly basic concept, but up until this time I had never actually needed a diode, nor did I know very well how they worked. That last bit still holds today.

    Instead of nodes with a 1.5 average component count, I now had two components per node. Initially, nodes with positive values had a resistor and those with negative values a resistor and a transistor, averaging 1.5 overall. Now I had a diode and a resistor for the positive values and a resistor and a transistor for the negative ones. The total is 16 inputs, one weight per input plus a bias, multiplied by 10 outputs, at 2 components per node: a theoretical part count of 340. The actual value came in a little higher due to connectors and LEDs, but close to the mark.
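The part count above follows from simple arithmetic; a quick sanity check, with the numbers taken from the text:

```python
# Theoretical part count: (16 inputs + 1 bias) nodes per output line,
# 10 output lines, 2 components per node (resistor + diode, or
# resistor + transistor).
inputs, outputs, components_per_node = 16, 10, 2
nodes = (inputs + 1) * outputs
print(nodes * components_per_node)   # -> 340
```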

    So, with my small simulation out of the way, I proceeded with a bigger one, with all 16 inputs' weights punched into the circuit plus the bias. I then...


  • Discrete component neural network node implementation

    ciplionej, 04/27/2020 at 17:08

    The goal of this task is to develop a solution to implement the neural network nodes using discrete components.

    Initial idea

    The initial idea was to use analog multipliers, based on a SE discussion. The source also references an additional solution using a MOSFET, but I couldn't find a way to implement it. The other solutions were the AD633 analog multiplier and the MPY534, the latter a bit more expensive.

    Either way, both solutions could theoretically work; nevertheless, it could get expensive quite fast. A quick back-of-the-envelope calculation: with 10 nodes, each needing 16 multiplications and additions, I'd end up needing 160 multipliers. The cheapest is the AD633, which goes for around 10 € a piece. For 160, they come down to around 7 € each, or over 1,100 € just on multipliers. That's a hard sell.

    Another solution mentioned was log-add-antilog OpAmp circuits. These would also need log amplifiers, and that becomes very expensive very quickly as well. In the end, the easiest to implement, and the cheapest, remained the analog multipliers.

    All in all, this seems like a dead end, not even taking into account the fact that I have negative numbers to multiply. Maybe this is also possible with analog signals, but I cannot wrap my head around it.

    I got it all wrong

    It occurred to me that if this were so simple to implement, someone must have done it before. So I started to do some research and found papers from the 80s where people were designing VLSI chips to deal with neural networks. Well, I cannot do VLSI; what else have you got?

    More recently, the work from wolfgangouille stands out, with a very cool implementation of neurons in his Neurino project.

    A bit more searching got me to memristor bars, via a nice video of Mr. Balasubramonian explaining his implementation. Memristor bars are really cool from an implementation point of view: you've got voltage signals coming in, a bunch of resistors and conductors, and finally a current output that can be transformed into a digital signal or voltage if needed. Hey, this is great!

    Not so fast, we still have to deal with the negative values on the weights.

    Some of the weights that came with the model were negative. Now, memristors work by turning the input signal (a voltage) into a current by passing it through a resistor. The resulting current is the product of the voltage and the conductance of the resistor. So how on earth am I going to come up with a negative-conductance resistor? Well, since I'm not EE trained, I didn't really know whether this was possible, but Wikipedia proved me wrong.

    Even though there are some conditions in which some semiconductors seem to show this behaviour, it didn't seem like an easy thing to throw into my NN. We needed a different solution.

    Turning the problem upside down, I tried to reflect on what had to take place in my model circuit.

    Case 1. Positive weight

    In this case, the weight would be represented by a resistor: the voltage multiplied by the conductance renders a current. That current is added to the current coming in on the Digit 1 wire and transformed into a signal at the exit.

    Case 2. Negative weight

    In this case, instead of putting current into my wire, I'd need to subtract current from it. So I just need a way to remove current from the wire, right? Well, put like that, it's more or less easy. The problem might be finding a linear way to do it, but a simple implementation can be found below. The negative weight is represented by a transistor driven by the current created by the sensor voltage and a resistor. As the transistor receives current at the base, it allows some of the current travelling through the wire to escape. Hence, the actual current at the Digit 1 ammeter goes down as the sensor voltage increases.
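A first-order sketch of that behaviour, idealised (fixed beta, no saturation); the function and the voltages/weights in it are illustrative, not values from the actual circuit:

```python
# Net current on a digit wire: positive-weight branches inject V * G,
# negative-weight branches pull off beta * I_base, where the base
# resistor is chosen as R = beta / |weight| (so the pulled current is
# simply V * |weight| for an ideal transistor).
def digit_wire_current(voltages, weights, beta=100):
    total = 0.0
    for v, w in zip(voltages, weights):
        if w >= 0:
            total += v * w                 # injected through a resistor
        else:
            r_base = beta / abs(w)         # base resistor for this weight
            total -= beta * (v / r_base)   # removed via the transistor

    return total

# Raising the voltage on the negative-weight sensor lowers the net current:
print(digit_wire_current([1.0, 1.0], [0.5, -0.3]))
print(digit_wire_current([1.0, 2.0], [0.5, -0.3]))
```

Note that beta cancels out of the ideal expression, which is exactly why the base resistor is scaled by it.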

    Below is a short video of the memristor with the negative weight in action.

    As the Sensor...


  • Discrete component neural network

    ciplionej, 04/25/2020 at 00:19

    The goal of this task is to train the simplest neural network that could recognise digits from the MNIST database and make it compatible with a discrete component implementation.

    Minimum Model Neural Network

    This one is an easy one. The simplest neural network (NN) is one with just the inputs and the outputs, with no hidden layer.

    This neural network has a total of 16 inputs, which represent the pixels on the 4x4 matrix described in the Minimum sensor log, and 10 outputs representing the 10 digits or classes.

    The confusion matrix for this model can be found below.

           0    1    2    3    4    5    6    7    8    9
      0  768   58   17   17    8    3   75    8  195   17
      1    0 1240    6   14   43   14    6   13    7   13
      2   26   91  708   61   57    0  171   28   63    5
      3   51  134  109  737   20   11   18  127   23   35
      4    4   94   13    0  592    6  112   95   27  191
      5  109   71    2   45  120  303  111   99  174   31
      6   42   45   44    0   95   11  945    2   26    2
      7    6   70    8   11   26    2    7  989    1  119
      8   45  189   19   34   16   20   61   30  656  102
      9   12  103   14    6  110    4   20  387   25  498

     The accuracy of the model was not bad, especially when compared with the decision tree or random forest models trained in the previous tasks.

            0         1         2         3         4         5         6         7         8         9 
    0.5256674 0.5608322 0.4909847 0.5072264 0.3634131 0.2667254 0.5270496 0.4876726 0.3829539 0.2939787
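These per-digit values are consistent with a per-class accuracy of TP / (TP + FP + FN), the same metric described in the cats-vs-dogs log; a quick check of digit 0 against the confusion matrix above reproduces the reported number:

```python
# Per-class accuracy from the confusion matrix: TP / (TP + FP + FN),
# i.e. diagonal / (row sum + column sum - diagonal), assuming columns
# are the digit shown and rows the classification.
row_0 = [768, 58, 17, 17, 8, 3, 75, 8, 195, 17]   # everything classified as "0"
col_0 = [768, 0, 26, 51, 4, 109, 42, 6, 45, 12]   # how the true "0"s were classified
tp = 768
acc_0 = tp / (sum(row_0) + sum(col_0) - tp)
print(round(acc_0, 7))   # -> 0.5256674, matching the table
```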

    Some numbers don't fare well at all, but let us not forget this is a 4x4 matrix, even I couldn't tell the image below belonged to a 0.

    Of course a higher pixel density will improve the accuracy of the model but I'm more interested for the time being in simplicity at the expense of accuracy.

    So this was the simplest NN model I could come up with. Next, we introduced some complexity to see how much we could improve the accuracy. The first model had a single hidden layer with 16 nodes, as shown below together with the accuracy values for each digit.

            0         1         2         3         4         5         6         7         8         9 
    0.5044997 0.6573107 0.5772171 0.5874499 0.4180602 0.5212766 0.6659436 0.4793213 0.4176594 0.3957754 

    The accuracy increased in some digits, but not so much as to justify the increase in complexity. The second model had two hidden layers with 16 nodes each, as shown below together with the accuracy values.

            0         1         2         3         4         5         6         7         8         9 
    0.4885246 0.7461996 0.5099656 0.4409938 0.4449908 0.4448424 0.6528804 0.5533742 0.3882195 0.2722791 

    The accuracy of the second model, with two hidden layers, was better than with one layer, but not for all digits, and the cost is really high in terms of complexity. Overall, the best trade-off between complexity and accuracy was the first model.

    Breaking down the maths

    As neural networks go, this one is as simple as they come. The nodes receive the inputs, multiply them by a weight and add a bias. Piece of cake.

    The result for each node is then:

        result = bias + Σ (input_i × weight_i)

    That means that, knowing the weights and bias of an output node, we could in principle calculate its output with very simple maths.

    For example, for the first model shown at the top of the page, the weights for each sensor and the bias for the output node for digit "5" are:

           Input      Weight       Signal       Product
    1       Bias -0.05691852          N/A   -0.05691852
    2   Sensor 1   0.5397329            0             0
    3   Sensor 2  -0.4359923  0.004321729   -0.00188424
    4   Sensor 3   0.3074641    0.2772309    0.08523854
    5   Sensor 4  -0.4892472 0.0004801921 -0.0002349326
    6   Sensor 5  -0.5862908            0             0
    7   Sensor 6    0.231549    0.4101641    0.09497309
    8   Sensor 7  -0.0697618    0.4403361   -0.03071864
    9   Sensor 8   0.5010364     0.164946    0.08264393
    10  Sensor 9    1.150069   0.04017607    0.04620524
    11 Sensor 10   0.1367729    0.3277311    0.04482472
    12 Sensor 11  -0.3862879    0.3078832    -0.1189315
    13 Sensor 12   0.7872604   0.08947579    0.07044075
    14 Sensor 13   -0.273486  0.009043617  -0.002473303
    15 Sensor 14   0.3749847    0.3589436     0.1345983
    16 Sensor 15   0.5307104   0.05786315    0.03070857
    17 Sensor 16  -0.8331957            0             0
                              Total:         0.378472

    Multiplying the response array from the sensors by the weights and adding the bias, I get the result for digit 5: in this case, 0.378472.
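The same calculation, reproduced from the table above (weights and signals copied verbatim):

```python
# Output-node value for digit 5: bias + sum(weight_i * signal_i).
bias = -0.05691852
weights = [0.5397329, -0.4359923, 0.3074641, -0.4892472, -0.5862908,
           0.231549, -0.0697618, 0.5010364, 1.150069, 0.1367729,
           -0.3862879, 0.7872604, -0.273486, 0.3749847, 0.5307104,
           -0.8331957]
signals = [0, 0.004321729, 0.2772309, 0.0004801921, 0, 0.4101641,
           0.4403361, 0.164946, 0.04017607, 0.3277311, 0.3078832,
           0.08947579, 0.009043617, 0.3589436, 0.05786315, 0]
result = bias + sum(w * s for w, s in zip(weights, signals))
print(round(result, 6))   # -> 0.378472
```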

    Easy, so now we need to implement this using discrete components.

  • A cat, a dog and a number walk into a bar

    ciplionej, 04/21/2020 at 00:49

    The goal of this task is to look into the future and determine other potential ML algorithms that could be implemented using discrete components.

    Cats vs Dogs

    MNIST digit recognition was nice to start with, but it's kind of the "hello world" of object recognition. Hence in the last log I focused a little bit on what could be an interesting challenge as a goal to bring Discrete component object recognition to the masses.

    An appealing idea was to get the discrete components to tell apart cats versus dogs. That would have been perfect, so I set sail to try and have a decision tree tell apart cats and dogs.

    The dataset was the cats vs dogs dataset from Kaggle, imported into R using the imageR package to crop the pictures square, resize them all to 100x100 pixels and reduce the RGB channels to a single grayscale intensity scale.

    The images were nowhere near as pretty as the original, but I could tell apart a cat from a dog without much effort, so I expected the algorithms to do the same.

    The decision tree was trained using the rpart R package and the accuracy was calculated using the same metrics used in the previous logs. In short, the number of identified cats was divided by the total of (identified cats + false cats + false dogs).

    The accuracy results were nothing short of appalling, no matter how complex the decision tree was made or how large the training set used.

     Confusion matrix
         cat  dog
    cat 1034  566
    dog  827  773
    
    Accuracy
          Cats       Dogs 
     0.4260404  0.3568790 

    After looking at the pictures for a while, I realised that DTs were never really going to cut it, no matter how complex they were, since the cats were always in different places in the image, with many other artifacts appearing around them. So I thought: this calls for random forests (RF).

    RF should address all of my issues: they'd build trees in different places around the image and get a better weighted response. Nevertheless, if this was indeed to be a Discrete Component affair, I had to try and limit the size of the model. The R caret package defaults to a random forest of 500 individual trees. There is no way that was going to get built, no matter how many years I had to stay at home. So I tried first with 10 trees, then 20, then 50, then 100, and then I realised this was also not going to fly.

    The accuracy was indeed better, but not really the improvement I was expecting.

     Confusion matrix     
         cat  dog
    cat 1043  557
    dog  652  948
    
    Accuracy      
         Cats        Dogs  
    0.4631439   0.4394993

    At this point I started to read about tree depth, impact of number of trees and realised that I needed a lot more trees and nodes to make this work. Cats and dogs was not going to be easy.

    Nevertheless it gave me an idea. What if I could implement more trees in the MNIST model, whilst keeping the number of nodes reasonably low?

    MNIST was not so boring after all

    So I came back to the MNIST dataset and tried to test the null hypothesis:

    "For the same overall number of nodes, decision trees have the same accuracy as random forest models"

    I started with a random forest with a single decision tree, maximum 20 nodes. The accuracy results were:

            0         1         2         3         4         5         6         7         8         9 
    0.4165733 0.5352261 0.3046776 0.2667543 0.3531469 0.0000000 0.2709812 0.3712766 0.1717026 0.2116372 

    Not pretty, but in line with what I had already obtained with the minimum model using 10,000 records.

    So, with the benchmark set, I trained a random forest with two decision trees, maximum 10 nodes each. And the results were:

            0         1         2         3         4         5         6         7         8         9 
    0.4800394 0.2957070 0.2172073 0.2161445 0.2116564 0.1522659 0.1727875 0.1840206 0.1464000 0.1265281 

    Surprise! For most digits, the random forest fared much worse than the single tree. There goes my null hypothesis.

    I wouldn't give up so easily, so I increased the number of nodes and trees, with more or less the same results.

    Only once I really increased the number of nodes to 1000 did I notice a difference between the two algorithms. For a single 1000-node...


  • RTL Digital Discrete component object recognition

    ciplionej, 04/13/2020 at 01:45

    The goal of this task is to actually make the digital implementation work.

    Forget Mr. Fields

    So, I had a go at working out why the previous circuit was not working, and I found out it was messed up in so many dimensions it's hard to name them all.

    In the end, I spent a lot of time learning, which I think is what projects are for. I've now got a much better grasp of the behaviour of NPN transistors and RTL, previously all dark arts to me.

    The circuit below describes the actual implementation of the left arm of the above decision tree implemented with RTL. I couldn't attach the circuit file, something wrong with the HAD.io platform today.

    So, implementation via RTL is possible, but a PIA, even for simple trees. Even locked up at home with nothing better to do.

    A much better solution would be to implement this with e.g. 74AS logic gate circuits. These have a theoretical propagation time of 2 ns for the faster chips. The most complex decision tree, discussed in the Bonus Track log, was 11 levels deep. Adding up the delays of all the gates, the response time would be 22 ns, or a detection frequency of about 45 MHz. This does not tell the whole story, since half of the paths would have to carry inverters along the way, but still, not very far off from these numbers. Not bad for a bunch of transistors.
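The timing arithmetic above, spelled out (2 ns per 74AS gate and 11 tree levels, as in the text; inverters ignored):

```python
# Worst-case propagation delay through an 11-level decision tree of
# 74AS gates at ~2 ns per gate, and the resulting detection frequency.
levels, t_gate_ns = 11, 2
t_total_ns = levels * t_gate_ns
f_mhz = 1000 / t_total_ns
print(t_total_ns, round(f_mhz, 1))   # 22 ns, roughly 45 MHz
```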

    Conclusion

    Finally we have reached the point where we have managed to implement both analog and digital solutions to the Discrete Component Object Recognition System.

    From an implementation point of view, the digital solution is the most robust, the fastest one and the one least likely to cause headaches. It's also the most forgiving in terms of sensors since it can readily work with LDRs, despite their relatively slow response time.

    Future work

    Well, wouldn't it be nice if more complex images could be recognized instead of just the MNIST numbers? What about cats and dogs?

  • Digital discrete component object recognition

    ciplionej, 04/10/2020 at 16:35

    The goal of this task is to make a prototype of the digital version of the discrete component object recognition system.

    Digital implementation

    Once it was clear that there was a digital pathway, I set forth to building a prototype.

    The first prototype was based on the leftmost pathway of the decision tree below.

    This would allow us to identify 100% of the "1"s with an accuracy of 53% and about 30% of the "8"s with an accuracy of 20%. An accuracy of 20% doesn't sound like much, but let's not forget that a monkey pulling bananas hanging on strings would get 10%, and this is a proof of concept. We could increase the accuracy, but I'd need a lot more sensors, and probably a more robust setup than dangly Aliexpress breadboards.

    By the way, does anyone know of a trick to secure resistors in place on a breadboard? The connection is sketchy and drives me mad.

    So, I sketched the circuit in KiCad on the kitchen tiles and got to building it on a breadboard.

    Later on I realized that there was a small mistake on my drawing but this got quickly corrected on the breadboard.

    Below is a picture of half the setup, not including the sensors. This is probably not what comes to your mind when someone asks you to picture Artificial Intelligence.

    In the end, as usual, the circuit didn't work. I used the Falstad circuit simulation tool to see what was going on and found out that the voltage I was using was too low to drive the LEDs at the other end of the circuit.

    There was not enough voltage to even get a signal to read from the Arduino, so it was easier to play with the values of the DC source until a reasonable signal came out of the other end. In the end, a 12 V DC source did it.

    Even spending a few hours changing resistor values and rerouting signals did not get me any closer to using both LEDs successfully. After a bit of thinking about the circuit, I realized that once one resistor upstream changed, so did the voltages downstream, so nothing was completely black and white after all. I really need some help figuring out how to implement the circuit for this system.

    In the end, I couldn't let my lack of know-how stop me from testing the feasibility of the concept, so I resorted to my good old trusty Arduino to do the ADC for me and tell me whether a 1 or an 8 was being identified.

    For the sensing side, the relevant pixels were drawn on the screen in order to set up the alignment on the breadboard and mount the sensors.

    Once the sensors were mounted, the alignment was verified and the setup was ready for testing.

    Test drive

    For the tests, the same MO was used as in the previous tasks, i.e. cycling digits on the screen, reading the voltage via the ADC on the Arduino UNO, print()ing the values to serial and reading the serial from Rstudio where it would get compared to the number displayed. The information would then be recorded on a confusion matrix.

    All went well with the Arduino, the ADC behaved beautifully and the signals for the binary 0 and 1 were clearly distinguishable from the voltage readings. Nevertheless, once I started testing, I realized that my sensors were too big for the pixels.

    Sensors too big? No problem, I thought: we just increase the pixel size. The matrix was averaged to halve the resolution, but since numbers tend to be thin by nature, a lot of strokes got erased in the translation from grayscale to black and white. This meant that even I could not tell the numbers apart.
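A toy illustration of why the averaging step erases thin strokes. The image values and the 2x2 block size are hypothetical; the 127 threshold is the one used in the Bonus track log:

```python
# Halve the resolution of a grayscale image by averaging 2x2 blocks,
# then binarize at > 127. A one-pixel-wide stroke of intensity 200
# averages down to 100 per block and vanishes.
def downsample_and_binarize(img, threshold=127):
    out = []
    for r in range(0, len(img), 2):
        row = []
        for c in range(0, len(img[0]), 2):
            avg = (img[r][c] + img[r][c + 1]
                   + img[r + 1][c] + img[r + 1][c + 1]) / 4
            row.append(1 if avg > threshold else 0)
        out.append(row)
    return out

thin_stroke = [[0, 200, 0, 0] for _ in range(4)]   # vertical one-pixel stroke
print(downsample_and_binarize(thin_stroke))        # -> [[0, 0], [0, 0]]
```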

    As W.C. Fields said:

    If at first you don't succeed, try, try again. Then quit. No use being a damn fool about it.

    Conclusion

    It's probably possible to make a binary system work, but with my current limitations I can only continue with some help getting the circuit to do my bidding, and with smaller sensors to measure said pixels.

    I'll try to find a way to make the sensors smaller, or get smaller sensors. To be continued.

  • Bonus track

    ciplionej, 04/07/2020 at 15:56

    Whilst thinking about the results obtained and licking my LDR-inflicted wounds, I thought about a way to implement this object recognition system that would not involve fiddly, hard-to-measure intensity levels.

    It occurred to me that I could change the intensity levels from 256 shades of grey, to a binary 0/1, i.e. if the intensity was higher than 127, then 1, else 0. In this way, the pictures become black and white, instead of grey shades, and we have new implementation possibilities.
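    The thresholding step amounts to one comparison per pixel; a minimal sketch in Python (the 127 cut-off is the one described above):

```python
# Convert a grayscale pixel matrix (0-255) to binary black/white (0/1),
# using the 127 threshold described above.
def binarize(matrix, threshold=127):
    return [[1 if px > threshold else 0 for px in row] for row in matrix]

# Example: a tiny 2x2 patch of intensities
patch = [[200, 30],
         [127, 255]]
print(binarize(patch))  # [[1, 0], [0, 1]]
```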

    Up till now, we needed to measure an intensity, translate it into an analog signal (a voltage), compare it to a reference voltage and get a digital signal (at each split node: Yes/No, Left/Right, etc.).

    Now that we have a binary signal, we no longer need to fiddle with intensity levels or analog signals, and we do not need voltage comparators.

    Heaven.

    Below is a decision tree modeled on the same MNIST dataset but using the new binary system instead of the intensity level. The accuracy is in line with the models obtained in the Minimum Model task using all the default settings in Rstudio and the rpart package.


    The confusion matrix below shows that a lot of digits get misclassified. Nevertheless, this model has not been optimized nor iterated in any way. The columns represent the digit shown and the rows represent how they were classified.

           0    1    2    3    4    5    6    7    8    9
      0  757    4    1   28    6   51  149  131   32   23
      1    6 1036   17  151    0    7   18   67   55    1
      2  120   78  458   26   39    2  168  110  169   22
      3   43   23   14  864   24   65    5  129   47   31
      4    3   27    5   31  665   28   64  220   23   88
      5  118   30   20  166   72  211   35  215   88  128
      6   67   35   50   24  102   19  585   61  268   24
      7    4   28   10   21   31   57   42  961   13   31
      8   36  143   51   68   19   34  205  146  391   40
      9    5   40    2   64   91  154   51  330   19  462

     The overall accuracy was just over 53% and the individual accuracy is shown in the table below.

            0         1         2         3         4         5         6         7         8         9 
    0.4779040 0.5866365 0.3362702 0.4736842 0.4323797 0.1406667 0.2966531 0.3686229 0.2116946 0.2876712 
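    For reference, the overall figure can be reproduced from the matrix above as the trace divided by the total, and the per-digit figures are consistent with correct hits divided by everything either shown as or classified as that digit, i.e. diagonal / (row sum + column sum − diagonal). A quick check in Python, with the rows copied from the table:

```python
# Confusion matrix from above: rows = classified as, columns = digit shown.
cm = [
    [757,    4,   1,  28,   6,  51, 149, 131,  32,  23],
    [  6, 1036,  17, 151,   0,   7,  18,  67,  55,   1],
    [120,   78, 458,  26,  39,   2, 168, 110, 169,  22],
    [ 43,   23,  14, 864,  24,  65,   5, 129,  47,  31],
    [  3,   27,   5,  31, 665,  28,  64, 220,  23,  88],
    [118,   30,  20, 166,  72, 211,  35, 215,  88, 128],
    [ 67,   35,  50,  24, 102,  19, 585,  61, 268,  24],
    [  4,   28,  10,  21,  31,  57,  42, 961,  13,  31],
    [ 36,  143,  51,  68,  19,  34, 205, 146, 391,  40],
    [  5,   40,   2,  64,  91, 154,  51, 330,  19, 462],
]

total = sum(sum(row) for row in cm)
overall = sum(cm[i][i] for i in range(10)) / total
print(round(overall, 3))  # ~0.533, i.e. "just over 53%"

# Per-digit accuracy: correct / (shown as digit + classified as digit - correct)
for d in range(10):
    col = sum(cm[r][d] for r in range(10))
    acc = cm[d][d] / (sum(cm[d]) + col - cm[d][d])
    print(d, round(acc, 4))  # digit 0 -> 0.4779, digit 1 -> 0.5866, ...
```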


    There is a huge improvement potential with this change.

    • There is no need to calibrate the LDRs, the change in resistance between a 0 and a 1 is enough to give a binary output
    • There is no need for voltage comparators, transistors are all that is needed, implemented as NOT gates.

    The simplified system is depicted below. All we'd need to implement this object recognition system is:

    1. 11 LDRs
    2. 11 NOT gates (can be built using 1 transistor and 2 resistors per gate)
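    In software terms, each node of such a tree reduces to reading one binary pixel and branching on it, which is exactly what a network of NOT gates can do in hardware. A toy sketch in Python (the pixel indices and leaf labels below are hypothetical, for illustration only, not the actual trained tree):

```python
# Toy evaluation of a binary decision tree over black/white pixels.
# NOTE: the pixel indices and leaf labels are hypothetical, not the
# tree actually trained on the MNIST data.
def classify(pixels):
    if pixels[6]:            # split node: is pixel 6 lit?
        if pixels[3]:        # next split: is pixel 3 lit?
            return "8"
        return "1"
    if pixels[10]:
        return "7"
    return "other"

# 16 binary pixels from the averaged 4x4 image, flattened row by row
sample = [0] * 16
sample[6] = 1
print(classify(sample))  # "1"
```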

    Adding complexity

    By changing the complexity parameter (cp), we can increase the accuracy significantly, at the cost of model complexity. The plot below shows the increase in accuracy as we reduce the cp value.

    As cp decreases, the number of splits increases. A model with a cp of 0.005, for example, shows a sizeable jump in accuracy to 62 %, with a still manageable 24-node tree.

    A further step down to 0.001 increases accuracy to a respectable 74 %, with a 91-node decision tree.

    There's a clear trade-off between complexity and accuracy, though more specific optimizations could potentially be carried out in order to obtain a better accuracy without a high complexity penalty.
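    The trade-off can be made concrete with the two operating points quoted above. A small helper (values taken directly from the text) picks the smallest tree that meets a target accuracy:

```python
# Complexity/accuracy operating points reported above: (cp, accuracy, nodes)
models = [
    (0.005, 0.62, 24),
    (0.001, 0.74, 91),
]

# Pick the smallest tree that reaches a target accuracy, or None if none does
def smallest_tree(models, target_acc):
    ok = [m for m in models if m[1] >= target_acc]
    return min(ok, key=lambda m: m[2]) if ok else None

print(smallest_tree(models, 0.70))  # (0.001, 0.74, 91)
print(smallest_tree(models, 0.60))  # (0.005, 0.62, 24)
```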

    Conclusion

    We have reached a solution that could allow us to implement a really simple object recognition system using readily available components.

    The accuracy could be considered to be within the goals of the project, at a respectable near-75 %, if we're willing to live with the complexity level.

    As a proof of concept, this project points the way toward more complex decision trees replacing more elaborate, slower and less power-efficient systems in object recognition applications.

  • Final concept

    ciplionej • 04/06/2020 at 01:07 • 0 comments

    The goal of this task was to validate the concept using discrete components and the reduced decision tree obtained in the Minimum hardware task, and to measure its accuracy.

    The model

    The model to be used was the one below, discussed and analyzed in a previous task.


    The hardware

    Due to stock and time limitations, only one LM393P voltage comparator was available, hence two splits could be implemented. The prototype was only going to be able to detect "Number 1" or "Numbers 4 or 7" by analyzing the signal from pixels 6 and 3. Nevertheless, only one of the LDRs used throughout this project was actually able to measure with acceptable drift and jitter so only digit "1" from the MNIST database was detected.

    Since we had a single digit to identify, a single LED was used to inform whether the digit being shown had been identified as a "1". In order to build the confusion matrix, the LED voltage was recorded by the Arduino UNO.

    The video below shows the setup in action.

    The confusion matrix is attached below.

              [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]       18   11    4    2    3    1    2    0    1     0
     [Not 1,]    9   11   16   22   15   19   16   15   22     0
    

    The accuracy towards number 1 was 35.2%. This was below the performance obtained with the micro-controller, which is disappointing. I really wanted to get a better value using only discrete components.

    Below is a close-up of the setup including the lonely LM393 in the center of the control breadboard and the single LDR on the "camera" breadboard. Since the screen brightness did not match the activation level for the voltage comparator, two 1k.ohm resistors were used in parallel with the LDR to align the sensitivity of the sensor to the screen intensity.
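    For reference, the effect of those parallel resistors is easy to quantify: two 1 k.ohm resistors cap the combined resistance at 500 ohm no matter how dark the LDR gets. A small sketch (the 14 k.ohm LDR value is illustrative only, borrowed from the prototype log, not a measurement of this particular sensor):

```python
# Combined resistance of resistors in parallel (product-over-sum generalized).
def parallel(*rs):
    return 1 / sum(1 / r for r in rs)

print(round(parallel(1000, 1000)))          # 500 -> upper bound set by the two 1k resistors
print(round(parallel(1000, 1000, 14000)))   # 483 with an illustrative 14 kOhm LDR
```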

    The final fine-tuning was carried out by displaying the target intensity on the screen and adjusting the screen brightness until the LED would be triggered around the level defined by the decision tree.

    Speed test

    In order to see how fast the system could detect the numbers, they were drawn at increasing speed until the system could no longer keep up.

    The test started at 1 Hz, was then increased to 10 Hz, then 20 Hz and finally 50 Hz.

    The surprising result was that the system could keep up, but the screen could not.

    In the video below, we can see the system recognizing number 1 at 20 Hz.


    Nevertheless, the system could not be run any faster than this. The reason is that the screen could not draw the digits at the 50 Hz target speed.

    Future work

    Stock MNIST performance

    The project was carried out using an averaged version of the MNIST dataset wherein the matrix was reduced from the original 28x28 to 4x4. The question that remains is: what would the accuracy of the system be when using the original database?

    Implementing a more robust decision tree

    This project proved that it's possible to implement a simple object recognition system using decision trees and discrete components. Nevertheless, a single split hardly represents a decision tree. A more complex decision tree would have been nice to test, given more time and the availability of parts.

    Using the right tools

    LDRs are definitely not to be used for this kind of project. Maybe you can make them work, but the ones I had proved to be pretty unreliable.

    A more robust system could probably be built with better sensors given the time. As Starhawk suggested, maybe photodiodes could be used instead with better results.

    More voltage comparators for a larger tree would also mean that we'd need plenty of individual, discrete reference voltages. In order to deliver those to the right comparators, a lot of voltage dividers could be used, but it could get complex pretty quickly. I wonder whether there are better tools.

    Measuring the speed

    Well, speed was one of the main drivers of this system and checking how fast it could go one of the objectives. Nevertheless, the screen refresh rate was the limiting factor in speed measurement.

    From my point of...


  • Prototyping

    ciplionej • 04/04/2020 at 02:18 • 0 comments

    The goal of this task was to prototype the solution using a micro-controller to validate the concept.

    The Arduino prototype

    An Arduino Uno did not really match the minimum-hardware philosophy, but I had nothing less fancy to try it on, and the focus was on making sure it worked on a micro-controller before moving on to the next task.

    The prototype was set up to detect 3 digits from the minimum sensor model below.

    The digits to be detected were "1", "4" and "7", using a total of 3 LDR sensors for pixels 3, 6 and 10.

    The set-up worked as follows:

    • Each LDR was part of a voltage divider with a 10 k.ohm resistor.
    • The LDRs were of 14 k.ohm (pixel 10), 40 k.ohm (pixel 6) and 60 k.ohm (pixel 3).
    • The sensors were aligned with the matrix on the screen in order to capture the analog output of each pixel.
    • The matrix representing the number was drawn on the screen of a laptop using Rstudio.
    • The response of the LDRs was calibrated using an intensity sweep before each test.
    • The resulting voltages were read via the ADC on the Arduino on pins A0, A1 and A2, and 3 simple if-then blocks implemented the splits of the decision tree.
    • Both the voltages and the likelihood of a number being identified were print()ed to the serial monitor.
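    Assuming the usual divider topology (LDR on the high side, the 10 k.ohm resistor to ground, 5 V supply; the log does not state which leg the LDR was on), the expected ADC counts for the three LDR values listed above can be sketched as:

```python
# Expected Arduino UNO ADC counts (10-bit, 0-1023) for each LDR in a
# divider with a 10 kOhm resistor to ground. The topology is an
# assumption; the project text does not say which leg the LDR is on.
VCC = 5.0
R_FIXED = 10_000

def adc_count(r_ldr, vcc=VCC, r_fixed=R_FIXED):
    v_out = vcc * r_fixed / (r_fixed + r_ldr)  # voltage across the fixed resistor
    return int(v_out / vcc * 1023)             # 10-bit ADC conversion

for pixel, r in [(10, 14_000), (6, 40_000), (3, 60_000)]:
    print(f"pixel {pixel}: {adc_count(r)} counts")
# pixel 10: 426 counts, pixel 6: 204 counts, pixel 3: 146 counts
```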

    The matrix was drawn on the screen and cycled through the dataset.

    Finally, the jig was placed in front of the screen. Both the displayed digit and the detected digit were recorded to fill a confusion matrix and verify the accuracy of the setup.

    In the end, the setup worked as expected, with the numbers 1, 4 and 7 being identified, sometimes. The accuracy, nevertheless, was appalling.

    All in all, it was mighty difficult to get a consistent reading out of anything other than digit "1". And of course that depends on your definition of consistent.

    Below is the confusion matrix for the prototype. Columns denote the number shown, rows the number identified by the decision tree. Note that if it was not a 1, 4 or 7, the model was set up to classify everything else as a 0.

          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]  [,0]
     [1,]   15    1    5    6    1    2    3    0    4     0
     [2,]    0    0    0    0    0    0    0    0    0     0
     [3,]    0    0    0    0    0    0    0    0    0     0
     [4,]    5    1    1    3    5    0    6    0    5     0
     [5,]    0    0    0    0    0    0    0    0    0     0
     [6,]    0    0    0    0    0    0    0    0    0     0
     [7,]    3    9    1    7    2    8    0    6    3     0
     [8,]    0    0    0    0    0    0    0    0    0     0
     [9,]    0    0    0    0    0    0    0    0    0     0
     [0,]    0    0    0    2    2    0    2    0    4     0

     The accuracy for digit 1 was just over 33%. For number 4, it was 7 %, and for number 7, well, not a single 7 was identified correctly.

    It was well short of the accuracy I was expecting to obtain with the setup so a bit of troubleshooting was in order.

    Debugging

    Cross-talk

    In order to verify whether there were any issues with the intensity of neighboring cells on the matrix affecting the readings of the sensors, the setup was modified using a modelling balloon cut to length in order to shield the sensors from light coming from the sides.

    The resulting setup with the protected sensors is shown below.

    There was no significant difference in the overall accuracy, so this was not really what was causing havoc in the system.

    Sensor accuracy

    The decision tree splits happened at very low intensity values for these sensors, i.e. the sensors needed to measure values very close to the edge of their envelopes. The plot below shows the voltage versus intensity for all three sensors and the area where they'd be causing a split on the decision tree.

    Even though the sensors are close to the edge of the envelope, there's plenty of measurement real estate available. This was probably not the cause of the accuracy issue, but the plot shows some jitter in the signal, visible as deviations in the measurements, though only in some sensors.

    Time series

    Finally, it was time to check whether the sensors could be drifting with time. The plot below shows consecutive reads of a fixed intensity value during around one minute.

    The sensor measuring pixel 6 shows a large tendency to drift. Since this sensor controls...


View all 14 project logs


Discussions

Starhawk wrote 04/04/2020 at 00:54 point

A couple of thoughts...

One, LDRs have *unbelievably* coarse tolerances. You are absolutely guaranteed to have to calibrate your voltage dividers individually, to even out all the imbalances they cause. A better choice might be an array of BPW34 photodiodes each driving JFETs -- photodiodes are photovoltaic devices (in fact, your standard solar cell is just a really massive photodiode) instead of being photoresistive. Yes, it increases overall component count, but repeatability and scalability to larger production and application numbers/cases goes up almost infinitely because you don't have to spend multiple afternoons arguing with the sensors to get it to work in the first place.

Two, back in the days of 8bit CPUs being common, and when microcontrollers were still "one-chip microcomputers" because the term "microcontroller" had yet to really catch on, NatSemi put out a bunch of ADC#### chips (that's "ADC" followed by a four-digit number -- ignore the rest of the name, it's almost always just packaging info) that were really good stuff. You can still get a bunch of em on eBay if you look for em right. "NOS" ("New Old Stock" -- i.e., purchased several years in the distant past and then promptly shoved in a drawer, never to be retrieved or used, until recently pulled out for sale) is the keyword there.

Three, if all else fails -- https://en.wikipedia.org/wiki/Resistor_ladder ;)


ciplionej wrote 04/04/2020 at 22:43 point

Thank you for your thoughts. They couldn't have been more timely.

Working with LDRs has been... interesting. Even calibrating before every run, the values are quite different, and the differences are in the range of what they should be measuring. So, yes, photodiodes will probably be in my BOM instead of LDRs.

With regards to the ADCs, I found those NOS parts. They do address the number-of-channels issue, making implementation a breeze. The speed, on the other hand, is slower than what I could find today. What I do not understand is why they are so slow. The ADC part should not take that long. Is most of the time per read taken up by the muxing?

And finally, the resistor ladder got me very excited. The reason for this is that initially, I wanted this project to be called "Passive component machine learning". Since electronics is not really my area, I had to read up on which components were passive versus active and realized that just "passives" was not going to cut it. After I saw your comment I thought that maybe there was a way to do the ADC without the voltage comparator, but I could not find a solution for this. Is there one? Am I not reading in the right places?


Starhawk wrote 04/05/2020 at 05:05 point

You're not getting away with having no ADC, that's for sure. The resistor ladder idea (you want the string one, not the R-2R one -- the R-2R ladder goes the other direction) converts a bunch of digital inputs (1s and 0s) into an analog input (a variable voltage). It's like reverse PWM, in a sense, because the voltage steps up/down in discrete levels but there's still a ton of room for variance. It's also one really intense way to get cozy with Ohm's Law ;)

The NOS parts are slow basically because they're old. The newest thing (as far as I'm aware) in ADCs and DACs is what's called a Flash ADC or Flash DAC. The "Flash" term has nothing to do with the similarly-named memory tech -- it refers to the fact that these converters work "in a flash" -- i.e., substantially faster than the old stuff, to the point of being almost instantaneous in operation *from the circuitry's perspective* -- which is something of an achievement when you consider what that means. Circuitry operation typically happens in nanoseconds if not even smaller units... watch this (https://www.youtube.com/watch?v=ZR0ujwlvbkQ -- sorry, there *was* a shorter version but it got yanked) if you want a bit of perspective on that, from someone uniquely well-equipped to provide it... ;) that lady is a legend, and rightly so.

...but the point is, those old parts are NOT Flash ADCs because Flash ADCs didn't exist yet. Those chips are a bit like sitting down with Fifties Guy and a mechanical adding machine, versus a modern teen with a TI-83+ for the Flash versions. They'll get the same result, sure, if they're working right -- but it takes the old gear that much longer because tech then simply isn't like what we're used to now. (...which, oddly, can be kind of entertaining to play with, although I suspect we might have similar interests there -- I have a remarkably nice pile of old stuff, going all the way back to the Commodore era...)


ciplionej wrote 04/08/2020 at 00:06 point

Hi Starhawk,

Just to let you know that the project is now complete. At least, according to the objectives I had for it. Thanks for your help.

Your ideas gave me a lot of food for thought, and all the way to the end your remark "You're not getting away with having no ADC, that's for sure." was just resonating in my head. 

After it was all wrapped up, an alternative came to my mind. If I translated the pictures from gray intensity (0 to 255) to black and white (0 or 1), then I would never need to calibrate the LDRs again, and they could supply me with digital signals instead of analog ones. This means that I can run the nodes of the decision tree without an ADC!

I wrote a final project log with the models using this binary solution, no ADC, no voltage comparator, no fiddly LDR voltage levels, and the theoretical accuracy is even better than with the previous solution. Success!

Once again, thank you for the support!


Starhawk wrote 04/08/2020 at 00:09 point

Well, what else can I say...

Well done! :D


thegoldthing wrote 03/28/2020 at 02:30 point

Very interesting project. Have you considered using an optical mouse for your sensor? Something like the ADNS-2620 has an 18x18 pixel array that can be accessed via i2c. Only problem is you might need some optics or very small letters.


ciplionej wrote 03/28/2020 at 09:57 point

That is a very good tip. I went through the TDS of the ADNS-2620 and it looks reasonably straightforward to implement. This led me to think of another camera that could be used: the finger-print sensor cameras. These communicate via UART instead of i2c.

There are two caveats, from the point of view of this specific project, when choosing to use a camera. First is the communication protocols: they all add time to the process, and I want this to be not only simple but also blazingly fast. We're talking ns to ps response times for the voltage comparators, with the slowest components, the sensors, at 20 ms to 30 ms.

Second is the amount of information. The amount of information that is not needed in order to recognize an object is mind-boggling. Once you train a model, it'll tell you specifically what pixels to look at in order to tell objects apart. If you collect more information, it'll just be ignored.

There is no doubt that your suggestion would make a breeze of implementing this once the decision tree starts to grow. I'll keep it in my short list for the final implementation.

Thanks for the tip!


Dan Maloney wrote 03/27/2020 at 20:14 point

I really like the idea of minimalist ML. It's so simple now to throw as much horsepower at ML applications as possible, with Nano and Pi 4 and all that. Seeing it reduced to the minimum will be a real hoot. 

Personally, I'm pulling for the 555s...

