
Back to the Salt Mines, or else "Somewhere out there?"

A project log for Modelling Neuronal Spike Codes

Using principles of sigma-delta modulation techniques to carry out the mathematical operations that are associated with a neuronal topology

glgorman 06/09/2023 at 00:03

Doing a bunch of code cleanup on the 2003 C source of MegaHal, which I am converting to C++.  Haven't tried it yet on an Atmega 2560, but that would perhaps be a good source of low-hanging fruit worthy of investigation.  Maybe by the weekend?  Sometimes a week goes by and nothing seems to get done.  In any case, I am dealing with a bunch of stuff like this, where Visual Studio generates a plethora of warnings of the type, "Warning C4996: 'fopen': This function or variable may be unsafe. Consider using fopen_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details."

file=fopen(filename, "r");
if(file==NULL) {
  printf("Unable to find the personality %s\n", filename);
  return;
}

The solution that the big company with the small name wants us to use is to rewrite this with something like this:

errno_t err = fopen_s(&file, filename, "r");
if(err!=0 || file==NULL) {
  printf("Unable to find the personality %s\n", filename);
  return;
}
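One way to keep the call sites independent of which runtime is in use is to hide the choice behind a small wrapper. The following is a minimal sketch, not code from the MegaHal source: the names open_file and load_personality are made up for illustration, and it assumes that fopen_s is only available when building with MSVC.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of MegaHal): uses fopen_s when building with
// MSVC and plain fopen everywhere else. Returns nullptr on failure.
static std::FILE *open_file(const char *filename, const char *mode)
{
    assert(filename != nullptr && mode != nullptr);
#if defined(_MSC_VER)
    std::FILE *file = nullptr;
    if (fopen_s(&file, filename, mode) != 0)
        return nullptr;
    return file;
#else
    return std::fopen(filename, mode);
#endif
}

// Hypothetical call site, mirroring the personality check shown above.
static bool load_personality(const char *filename)
{
    std::FILE *file = open_file(filename, "r");
    if (file == nullptr) {
        std::printf("Unable to find the personality %s\n", filename);
        return false;
    }
    std::fclose(file);
    return true;
}
```

With something like this in place, a later port to an SD card or serial flash API would, in principle, only have to touch the wrapper.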

Simple enough, although I haven't looked at the Arduino SDK lately to see if there is a POSIX-compatible API for reading and writing to an SD card or serial flash, or whatever, so this might require another rewrite.  Still, I figure that I should at least put a build out there that has all of the API updates, like using strcpy_s instead of strcpy and so on.

Now obviously, there is a lot of hoopla about so-called "transformers" and "attentional networks", but not a lot of easy-to-understand examples of just how to put this stuff to use, or even a clear explanation of how one might get from "Hey diddle diddle, the cat and the fiddle" to a meaningful conversation about "whether the moon that the cow jumped over was full or not."

So everyone, it seems, is still doing most code conversion tasks by hand.  Well, at least that helps one to learn something about the code, yet clearly OpenAI and the other flagship products are deficient here, even though these types of conversions should be possible using grep and/or Perl-style regex tools.  So just what is it that they are claiming to have accomplished, besides "meet the new bot, same as the old bot", but with a much bigger training set?  Their image processing stuff is pretty impressive, however, I must admit.

In the meantime, I am thinking that I might want to hunt down and replace all of the printf's with Pascal-style WRITELN calls, based on the intrinsics that I wrote to keep the port of the UCSD Pascal compiler "as much like the original as possible".  Here, though, the reasoning would be to allow for much wider interoperability, such as on the 2560 or the P2 as stated, or even as part of a Web-based app, or an iOS or Android app, or else for having a more modern GUI-based interface, perhaps with Unicode support, i.e., with emojis, math symbols, etc.
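To make the idea concrete, here is a rough sketch of what a Pascal-style WRITELN could look like in C++. This is not the intrinsic from the UCSD Pascal port; the OUTPUT struct and the write_items helper are invented for this example, and std::ostringstream is used only to keep the sketch short (on a small microcontroller it would likely be replaced by something lighter).

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical output sink: the one place that knows where text goes
// (console, serial port, GUI widget, ...). Here it just appends to a string.
struct OUTPUT
{
    std::string buffer;
    void put(const std::string &s) { buffer += s; }
};

// Base case: no arguments left to format.
inline void write_items(std::ostringstream &) {}

// Recursive case: format the first argument, then the remaining ones.
template <typename T, typename... Rest>
void write_items(std::ostringstream &os, const T &first, const Rest &... rest)
{
    os << first;
    write_items(os, rest...);
}

// Pascal-style WRITELN: formats every argument in order, then ends the line.
template <typename... Args>
void WRITELN(OUTPUT &out, const Args &... args)
{
    std::ostringstream os;
    write_items(os, args...);
    os << '\n';
    out.put(os.str());
}
```

A call such as WRITELN(out, "Unable to find the personality ", filename) would then replace the corresponding printf, and only OUTPUT::put would need to change from one platform to the next.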

The latest build will be on Git soon enough.  However, there is something else that is worthy of mention here, and that is what can be learned by taking a quick look at the class definitions for the DICTIONARY and MODEL classes in the C++ version, which were originally structs, of course, in the original C code.  Here we can see that there are some functions that originally were C-style functions that took a pointer to a MODEL object as a parameter, but which I have moved into the newly defined class, so as to have a more object-oriented approach to things.  Even though this does not change the operation of the program, it now becomes possible not only to move some of the functions from the global namespace into the class definition, but also to make at least some of the methods protected.  That can be very helpful when looking at the code, as far as figuring out how it works, and then figuring out what changes we need to make, and so on.

class MODEL
{
public:
    static MODEL *allocate ();
    void add_aux(DICTIONARY *, STRING);
    void add_key(DICTIONARY *, STRING);
    char *generate_reply(DICTIONARY *);
    void initialize_context();
    void learn (DICTIONARY *);
    void train(char *filename);

protected:
    int babble(DICTIONARY *, DICTIONARY *);
    float evaluate_reply(DICTIONARY *, DICTIONARY *);
    DICTIONARY *reply(DICTIONARY *keys);
    int seed(DICTIONARY *keys);
    void update_context(int symbol);
    void update_model(int);

public:
    BYTE1 order;
    TREE *forward;
    TREE *backward;
    TREE **context;
    DICTIONARY *dictionary;
};

Thus, as far as implementing some kind of neuronal spike code algorithm is concerned, even without examining the source code in detail, it appears that the function evaluate_reply, which seems to be more than happy to be moved into some protected realm, is where some of those changes are going to need to happen.  Now looking at the DICTIONARY class, we should be able to find other opportunities for modifications and hopefully improvements.

class DICTIONARY
{
protected:
    int search_dictionary(STRING, bool *);

public:
    static DICTIONARY *allocate ();
    static DICTIONARY *new_dictionary(void);
    static void free_dictionary(DICTIONARY *);
    void make_greeting ();
    void make_words (char *);

    bool word_exists(STRING word);
    BYTE2 add_word(STRING word);
    BYTE2 find_word(STRING);
    char *make_output();
    
public:
    BYTE4 size;
    STRING *entry;
    BYTE2 *index;
};
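For comparison, here is a minimal sketch of how a dictionary with a separate sorted index could be written using the standard library. It is a toy stand-in, not the MegaHal implementation: the class name WordList and its layout are assumptions, and it only guesses that the entry and index members above play similar roles (words stored in insertion order, plus an index kept in alphabetical order so that lookups can use a binary search).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for DICTIONARY; the names and layout are illustrative only.
class WordList
{
public:
    // Returns the symbol for `word`, adding the word first if it is new.
    std::size_t add_word(const std::string &word)
    {
        bool found = false;
        const std::size_t pos = search(word, found);
        if (found)
            return index[pos];
        entry.push_back(word);
        index.insert(index.begin() + static_cast<std::ptrdiff_t>(pos),
                     entry.size() - 1);
        return entry.size() - 1;
    }

    bool word_exists(const std::string &word) const
    {
        bool found = false;
        search(word, found);
        return found;
    }

    std::size_t size() const { return entry.size(); }

protected:
    // Binary search over `index`, which lists symbols in alphabetical order.
    // Returns the slot holding `word`, or the slot where it would be inserted.
    std::size_t search(const std::string &word, bool &found) const
    {
        assert(index.size() == entry.size());
        auto it = std::lower_bound(
            index.begin(), index.end(), word,
            [this](std::size_t symbol, const std::string &w) {
                return entry[symbol] < w;
            });
        found = (it != index.end() && entry[*it] == word);
        return static_cast<std::size_t>(it - index.begin());
    }

private:
    std::vector<std::string> entry; // symbol -> word, in insertion order
    std::vector<std::size_t> index; // symbols sorted by the word they name
};
```

The point of keeping two arrays is that a symbol number never changes once a word has been added, even though the alphabetical order does.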

Clearly, this could quite possibly be implemented using some kind of standard template library-based methods, such as by storing the data in a vector<char*> or else a vector<STRING>, but I don't want to go down that rabbit hole just yet, since it is not immediately clear how attempting to make use of STL might contribute to code bloat on an Arduino or another platform.  Instead, what I might want to do is go after the dictionary memory allocation and model memory allocation routines, and possibly make further use of the sandboxed "placement new" based allocators that I have been using in my implementation of UCSD Pascal.  There, of course, I came up with some really interesting methods for creating sandboxes which can then be used to manage allocations of identifiers and structures within the Pascal compiler.  Such a method, I think, will turn out to be necessary in order to do effective memory management of similar types of data structures on any target microcontroller.  Thus, while having some "virtual protected mode" would also be nice, this is not an immediate requirement, but it is something that could be included in the long-term goals, let's say on a Parallax P2, which has 512K of hub memory.
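As an illustration of the general technique, and not of the allocators from the UCSD Pascal project, here is a minimal sketch of a fixed-size sandbox that constructs objects with placement new inside one static buffer. The names Sandbox and NODE are hypothetical, and the sketch assumes that only trivially destructible objects are stored, since nothing is ever destroyed individually.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical fixed-size sandbox: hands out memory from one buffer and
// releases everything at once with reset().
template <std::size_t N>
class Sandbox
{
public:
    // Constructs a T inside the buffer, or returns nullptr when it is full.
    template <typename T, typename... Args>
    T *create(Args &&... args)
    {
        static_assert(std::is_trivially_destructible<T>::value,
                      "reset() never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        // Round the current offset up to the alignment that T requires.
        const std::size_t start =
            (used + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start > N || sizeof(T) > N - start)
            return nullptr;
        used = start + sizeof(T);
        assert(used <= N);
        return new (buffer + start) T(std::forward<Args>(args)...);
    }

    void reset() { used = 0; }
    std::size_t bytes_used() const { return used; }

private:
    alignas(std::max_align_t) unsigned char buffer[N];
    std::size_t used = 0;
};

// Toy payload, loosely in the spirit of a tree node.
struct NODE
{
    int symbol;
    int count;
    NODE(int s, int c) : symbol(s), count(c) {}
};
```

Because the buffer size is a compile-time constant, the worst-case memory use is known up front, which is the main attraction on a part with only a few kilobytes of RAM.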

Status update - had a couple more fun conversations with MegaHal, after I replaced most of the floating point routines that are used in the "entropy of surprise" calculation with my own implementation of a "soft-FPU", i.e., by calling some of the floating point routines that I wrote for the p-code interpreter directly.  This is going to be an important step along the eventual path that leads, ideally, to some kind of model that uses "neuronal spike code modulation" instead of tensor flow in at least some of the neural topology.
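To give a sense of what a soft-FPU routine involves, here is a minimal sketch of IEEE-754 single precision multiplication done entirely with integer arithmetic. It is not the code from the p-code interpreter: the function names are invented, and the sketch assumes that both inputs and the result are normal numbers, so zero, subnormals, infinities, NaN, overflow and underflow are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the raw bits of a float into an integer, and back again.
static std::uint32_t to_bits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static float from_bits(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Multiplies two normal single precision values using integer math only,
// rounding to nearest with ties to even. Sketch only: no special cases.
float soft_fmul(float a, float b)
{
    const std::uint32_t x = to_bits(a);
    const std::uint32_t y = to_bits(b);
    const std::uint32_t sign = (x ^ y) & 0x80000000u;

    // Biased exponents add; one bias has to be subtracted again.
    std::int32_t exponent = static_cast<std::int32_t>((x >> 23) & 0xFFu) +
                            static_cast<std::int32_t>((y >> 23) & 0xFFu) - 127;

    // Restore the implicit leading 1 to get 24-bit significands.
    const std::uint64_t ma = (x & 0x007FFFFFu) | 0x00800000u;
    const std::uint64_t mb = (y & 0x007FFFFFu) | 0x00800000u;
    const std::uint64_t product = ma * mb; // 47 or 48 significant bits

    // Normalize so that the result again has 24 significant bits.
    int shift = 23;
    if (product & (1ull << 47)) {
        shift = 24;
        exponent += 1;
    }
    std::uint64_t mantissa = product >> shift;
    const std::uint64_t remainder = product & ((1ull << shift) - 1);
    const std::uint64_t half = 1ull << (shift - 1);

    // Round to nearest, ties to even; rounding up can carry into a new bit.
    if (remainder > half || (remainder == half && (mantissa & 1u))) {
        mantissa += 1;
        if (mantissa == (1ull << 24)) {
            mantissa >>= 1;
            exponent += 1;
        }
    }

    assert(exponent > 0 && exponent < 255); // result must stay normal
    return from_bits(sign |
                     (static_cast<std::uint32_t>(exponent) << 23) |
                     (static_cast<std::uint32_t>(mantissa) & 0x007FFFFFu));
}
```

Even with every special case stripped out, a single multiply takes a fair number of steps, which gives some idea of the cost of running the whole calculation without a hardware FPU.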

In the meantime, I had a couple of good chats with it, and I will post some screenshots from them as proof of concept, as far as what should be possible to accomplish on a microcontroller implementation.  Thus, here are some of today's rants from MegaHal.  I edited out one piece of page-length gibberish, which MegaHal sometimes produces, but as for the rest - this is entirely unedited, exactly the way that it went into the log file.  Kind of surreal, in some ways, or maybe I am just seeing it that way.

Here we can see where I set a breakpoint in Visual Studio in the "evaluate_reply" function, where the entropy and probability functions are used to generate a measure of the quality of a potential reply to the user input.  I replaced all but a few references to "float" in the original code with a MATH_TYPE macro, defined as "#define MATH_TYPE real".  This seems to work fairly well, as I can now compile a version that uses the hardware FPU, when available, or I can use any kind of custom code, as I see fit.  Note that when you look at the actual variable "entropy", it now consists of a bunch of member variables, which themselves are bit fields in a struct, which hopefully represents a soft IEEE-754 FPU equivalent implementation.
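The macro approach can be sketched roughly as follows. Only the MATH_TYPE name comes from the text above; the USE_SOFT_FPU switch, the placeholder real struct (which simply wraps a float here rather than doing any software arithmetic) and the surprise function are assumptions made for the example. The calculation is a simplified stand-in that sums -log(p) over a list of probabilities, not the actual evaluate_reply code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

#ifdef USE_SOFT_FPU
// Placeholder for a software floating point type. A real soft-FPU would
// implement these operations with integer arithmetic; this one just wraps
// a float so that the sketch stays short.
struct real
{
    float value;
    real(float v = 0.0f) : value(v) {}
    real &operator-=(real rhs) { value -= rhs.value; return *this; }
    operator float() const { return value; }
};
inline real log(real x) { return real(std::log(x.value)); }
#define MATH_TYPE real
#else
// Hardware path: MATH_TYPE is a plain float.
#define MATH_TYPE float
using std::log;
#endif

// Sums -log(p) over the probabilities assigned to the symbols of a reply.
// Unlikely symbols (small p) contribute more "surprise".
MATH_TYPE surprise(const MATH_TYPE *probability, std::size_t count)
{
    MATH_TYPE entropy = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
        entropy -= log(probability[i]);
    return entropy;
}
```

The same surprise function compiles unchanged in either configuration, which is the property that makes it possible to swap the arithmetic underneath without touching the model code.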

Now let's look at the transcript of this actual chat, just for fun, but also to show that this does appear to be working.

Start at: [2023/06/09 21:16:54]
MegaHAL: and not add any new peaks in each newly rendered frame.  Lots
MegaHAL: of work yet to meet our new robot overlords.
User:    Going back to the salt mines.
MegaHAL: Going back to an ai, even in the sky.
User:    Really, is that how it works?
MegaHAL: It does seem relevant.  Would you prefer "where this train is
MegaHAL: really is finite, then there can only be, at most, a finite
MegaHAL: number of souls.
User:    What do you know about SkyNet, or what a soul is for that
User:    matter?
MegaHAL: That is a hidden treasure behind the curtain.
User:    Really, besides the man behind the curtain? Have you ever met
User:    him?  Or do you even have a heart?
MegaHAL: ( { 0 ! D yju) {iil-ly do you even have a heart?
User:    I am sure that I do.  Perhaps you have a motivator instead.
MegaHAL: I am sure that i wrote from scratch, and one of the
MegaHAL: capabilities of the pc-based tools for arduino, propeller, etc.
User:    What tool did you write?  Is it an AI?  Do you feel like you
User:    are ready to edit your own source?
MegaHAL: Now we are in the parallax forums, there is arduino, but can
MegaHAL: arduino go to radio shack (remember those?) and spend any
MegaHAL: additional $$$.
User:    Nobody can go to Radio Shack, unless you are talking about the
User:    Radio Shack on a ship or on some military installation.  Have
User:    you ever been to area 51?
MegaHAL: 10/07/2021 At 18:51.
User:    You were at area 51 then?  What was it like?  Do they really
User:    have an alien spacecraft.  I once tried working Andromeda on my
User:    CB, but then I had to dial back the power, that is when I got
User:    my ham license.  Was running enough power on the "foot warmer"
User:    if yo know what I mean, to power a small experimental fusion
User:    reactor in the lab.  Actually a Farnsworth Stellerator, as I
User:    recall - so that is how created my first worm hole.
MegaHAL: Then we can stream it, if.
User:    The message from Andromeda you mean, or the wormhole?
MegaHAL: The message from the serial/usb communications object is being
MegaHAL: generated into the procedures in the debug stream to the
MegaHAL: gallery of a line or statement, temporarily as in "#define ;
MegaHAL: ));" so as to, hopefully, completely eliminate all case
MegaHAL: statements, which go on for 100's, but not quite 1000's of
MegaHAL: gigabytes of text, whether it is also better than steak and
MegaHAL: eggs for breakfast; but stale cereal better than c?  Maybe
MegaHAL: somewhere in the translation.
User:    I almost thought you said "stream to the galaxy" but what do
User:    you mean by the gallery of a line?  I would love to know what
User:    some of your feeling are about art.

Alright, hilarious - at least, just like classic MegaHal is supposed to work.  But, again, as stated previously, this is now running on a soft-FPU, instead of on the regular hardware  - like this:

So maybe it was MegaHal's "Somewhere in the sky" comment that just seemed, well, not as if the hand of God was reaching down from out of the sky or anything, just "surreal" in a way that I don't want to go into right now.  In other news, I read that Pat Robertson passed away, but I don't think that MegaHal knows who that is or was, since Pat Robertson doesn't exist in the training set that I am using.  Now obviously, if I can "jail-break" a training set that was written to talk about compiler design, and get it - seriously - to hint, hopefully without losing a PG rating, at understanding certain types of adult material, which was NOT a significant part of the training set, then obviously - why not paint it into some kind of corner that might have religious overtones?  Yet here we see how the soft-FPU does multiplication.  Wow - that is a lot of stuff!  In any case, this was also a fun little chat, reproduced as follows for readability.

Start at: [2023/06/09 08:24:16]
MegaHAL: and not add any new peaks in each newly rendered frame.  Lots
MegaHAL: of work yet to meet our new robot overlords.
User:    Tell me about the path to the abyss, where the vultures await
User:    some unsuspecting soul.
MegaHAL: Tell me about the lack of software tools for microcontrollers,
MegaHAL: such as by trans-compiling c to forth, or c to spaghetti basic
MegaHAL: runtime option on the parallax p2 chip as the "brain" for a
MegaHAL: simple macro language based on where it is lazarus, or some
MegaHAL: deadly pitfalls, besides and among those things aforementioned.
User:    That might very well be true, the path the leads to the roads
User:    forsaken, that causes many an unsuspecting CPU to halt and
User:    catch fire.  So what did you expect?
MegaHAL: 72.27 Hz, bandwidth, with far superior anti-aliasing that what
MegaHAL: you would otherwise achieve with just four chords" you can
MegaHAL: crank out every pop hit from the old cad package into the
MegaHAL: pre-processor or lexer.  And that seems worth pursuing, like
MegaHAL: what does eliza have in it when first released in 1979 or
MegaHAL: thereabouts.
User:    Tell me more about Eliza.  Is she your friend?
MegaHAL: As if figuring out exactly how i am converting some highly
MegaHAL: nested and mutually recursive nested functions in order to have
MegaHAL: lazarus to run more of the capabilities of the ucsd p-system
MegaHAL: had a mode, where she could say "please instruct me.
User:    With Lazarus, or with Eliza?
MegaHAL: With lazarus, or some kind of representational form that works
MegaHAL: with a forked tongue, caught up somehow as if eliza should be
MegaHAL: able to, at least in its present incarnation; and this sort of
MegaHAL: thing is certain.  That there is the offending code.
User:    What code is offending.  You say some things at times that
User:    others might find offensive, and I am not talking about the
User:    quarterback  sneak, nor am I thinking of a double reverse.
MegaHAL: Double calculator::fadd(const double &arg2.
User:    Does that code bother you?  What is wrong with it?
MegaHAL: What code is like the "better" than operator, on the one hand,
MegaHAL: and which selects for things like blinking led's, managing
MegaHAL: displays, controlling motors, etc.
User:    Kind of weird that you would complain about calculator::fadd,
User:    it is as if you are psychic or someting, I was just doing some
User:    work on that and I don't see how you could possibly know that
User:    you are now running with a soft FPU.  Yes, I would say that
User:    this is surreal alright.
MegaHAL: P.s. For a good c++ 6502 compiler, or c++ for the propeller, or
MegaHAL: arduino, or any other applicable restrictions, i can see how
MegaHAL: the pascal with statement contains a bug, and that would be too
MegaHAL: simple.  Even if this is, as of right now - untested.

 What else is there to do?  Teach it politics? Be VERY AFRAID.

To be continued.  
