Close
0%
0%

Bitlang

Minimalistic language for Turing machines and beyond.

Similar projects worth following
Bitlang is a minimalistic procedural/object-oriented language that's easy to parse (easy to write backends, tools/analyzers or porting to other languages). The language is designed to be heavily optimized at compile time. While it mostly targets software binaries, it should work for programmable hardware too (not prioritized, so there are better alternatives). Another use is the popular trend of uploading parallel tasks to the GPU.

Software can communicate and be deployed to hardware, therefore I consider this a valid THP Entry. The hardware running this will also be able to communicate with other hardware or other parts of hardware. To prove it, a simple Bitlang to Arduino Sketch compiler will be included. Beware, we WILL be blinking LEDs!

Bitlang is dual-licensed as MIT and 3-clause BSD.
BitlangCompiler uses the various licenses of LLVM (see their webpage).

https://www.youtube.com/watch?v=71GiHV5PnZg (tip of the day: read below instead)

Language Characteristics:
- The language has one built-in type: the boolean type called 'bit'. It also has one built-in operator, the ternary operator: a ? b : c. These work just like in C or similar languages.

- References, like pointers but neither can you change them nor can you assign NULL to them. Need NULL? Borrow from the functional world and use Option, like so: http://alvinalexander.com/scala/using-scala-option-some-none-idiom-function-java-null . Remember, we will optimize this for you (e.g. it becomes a regular pointer that can be null in the compiled sources). We'll also give you a nice memory-manipulation API if you like playing dangerous, unless you're in a sandbox environment of course.

- Also, there is support for tuples, that can contain an ordered set of bits, references or other sub-tuples. With some imagination, these go beyond simple vectors and matrices, becoming complex types as described below.


- The block:

The block contains code. It's like a C-like function, but without a name. In other words, a lambda-function.
-- Degree 1: 1^{ config.load userFile } or just { config.loadUser }.

-- Degree 2+: Macros-on-steroids that run at compile time: 2^{ config.mainSymbol }.

-- Degree 0: Strings: 0^{ abc } or just "abc" are like C-strings, but only available to degree 2-blocks and beyond. The first one being neat for embedding other languages, like SQL. It's up to the programmer to implementing C-strings, UTF-8 arrays, sprintf, shader pre-compilation or whatever suits the application. However, we give you a standard library with UTF-16 as standard.

Blocks will be able to take variables as arguments and return data. If these variables are supposed to be saved somewhere, that has to be specified (it means the compiler has to use another allocator than the internal stack-based one, like the malloc allocator in the standard library or the GPU-buffer allocator in OpenGL).

- Types with symbols. While recursive tuples make up the data structure/composition, the most important part of any non-trivial program is organization. Types links tuples of bits and references to symbols. A symbol is either a method or an operator (they're mostly the same in Bitlang). The symbol consists of a degree-1+ block. If it is a degree-2+ block, it is also specified if a value or an expression is needed. If it is an operator, it is specified if it's binary/unary and which precedence it takes. Further information on this subject will come in a separate post.

- There will be a scope-system (for variables) similar to the JavaScript prototype chain. Scopes can be bound by higher-degree blocks for method calls. The default is what you would expect, inline-blocks inherit from outer blocks. But, the scope may be replaced or mixed in for some contexts. It's important this can only be done in a way that cannot confuse programmers, maybe by using a prefix or similar. Type systems can use this to create magic this, parent/super etc. variables.

The Standard Library:

- Contains most of what's considered "the core language", in other languages. For example, a standard implementation of classes goes in here (but you can use your favourite implementation if you want, like ObjC's or JavaScript's). Another example is the integer types, uint64 and the like. Of course, these are heavily optimized at compile time.

- The problems goes into the library rather than the spec. - we can throw out problematic features without ruining compiler infrastructure in the future. Check out Google's C++ style guide and you will have a good laugh, it's mostly about the hundreds of features you're not allowed to use. Also, check out the numerous Java Collection frameworks out there that struggle with generics, or the dependency injection libraries that almost work with a lot of runtime reflection magic. Then look att the success of the Boost library. It's easy to see that flexibility matters.

- A clear API. Remove the confusion! Ruby does this good. No abbreviations, good convention like...

Read more »

  • Youtube Video

    JoelS08/21/2014 at 01:09 0 comments

    My mic failed on me so you will have to watch the annotations. Or, better read the description on this page instead. I will try to fix a better video later on when I get sound working, but I need this one for the THP Entry submission right now.

  • How about higher-order blocks?

    JoelS08/20/2014 at 23:58 0 comments

    When you read this article, remember the reason for doing this:

    * Specification

    * Code analysis

    * Testing: We can tell the compiler to not call that block (function) after all, hijacking it or spying on it.

    * Keep the language small at all costs and put everything into the standard library. Writing new compilers becomes easy. Sometimes you don't need speed, you just need to get it working on a new platform (even if the examples below are extreme).

    * The development of the Bitlang Compiler will be easier. (We'll use a very simple parser + LLVM.)

    Higher-order blocks aren't really C-macros or C++ templates / Java generics. They're more than that and need to be powerful. For example, let's consider this:

    if(a() || b()) {

    ...

    } else if(c.blah) {

    ...

    }

    When we implement this, there are a few things to consider. If a is false, we don't want to evaluate b. Likewise, if the first if fails, we need to do the else.

    To implement this, we could define (in pseudocode now):

    first = 2^{ |first, second| return first }

    if = 2^{ |bit expression, block run|

      return ELSE(expression ? first(false, run()) : first(true, nothing()))

    }

    It gets a bit ugly, but once we have if, everything will be nice in the future. Also, note that this is the specification. The heavily optimized standard library in the compiled code will look more or less exactly as if a C compiler produced it.

    So, "if" is a named block that evaluates the block if expression is true. It has two parameters. Everything works nicely, we run if the evaluated expression is true and return some kind of else-type. The else type (as well as the logical-or operator || that need to skip evaluation of b() if a() is true) is harder. Here, we need to be able to take an expression and evaluate it at will. This can be used in two ways:

    * Branching. This is where conditional branching comes into the picture. For not-yet optimized compilers, it happens in the ternary expression. For optimized ones, it happens right in the source code as "branch if xxx".

    * Analysis. A simple example is hijacking 0^{ this is a %s string } or "this is a %s string" and return the right type. More examples would be in testing where we need to stop certain things from happening.

    Things get more complex, because we can have higher-than-2 degree of blocks too, i.e. blocks than run inside the blocks inside the compiler/tool. That will be very handy for writing the compiler that interfaces LLVM.

  • Operator precedence

    JoelS08/20/2014 at 23:24 0 comments

    Operator precedence is important for some applications, but it can also be dangerous. As an example, what does (1 + 2 % 2)  or (2 * 3  % 2) do? I've been bitten by this one before. For Bitlang we can't really define operator precedence, since that's up to the developer.

    Let's consider some cases:

    -3 + 5 <= It doesn't matter much here, but it would be good if the - was applied first.

    3 - 5 <= We mean 3 + (-5), but cannot assume that here. What if order is important for some type?

    3 * -5 <= Hmm, tricky one. We have two operators here but that could be one (&&, *=, << etc have two chars.)

    right.index - left.index <= We mean (right.index) - (left.index), not ((right.index) - left)

    1 + 3 * 5 - 4 <= Important to get the order right.

    1 / 4 * 5 <= We mean (1 / 4) * 5.

    We can study this as mathematics but that's not really needed. Let's just keep it simple and categorize the operators into two groups: greedy and non-greedy. Addition is an example of a greedy operation since it eats the whole multiplication-expression, while * is a non-greedy one as it doesn't eat the -4. Now, we can say that operators that begin an expression is left-unary and sticks to the immediate operand next right to it. If the operand is a variable or symbol on the other hand, the following (if any) operator is applied - UNLESS it's a greedy one, as it then have to wait for all non-greedy ones following.

    How do we solve the problem where we mix +- and */? It will solve itself. But, what about the ambiguous %-operator? What about the &&, || or == as commonly used in other languages?

    (a + b == c * d && e * f == g + h)

    We don't want to add parentheses here. How can this be solved? It actually is solved, it works given that greedy operators eat other greedy operators - with the order being left-to-right. 1 + 2 + 3 would be computed as 1 + (2 + 3). This is something you have to deal with if you're greedy, and it's commonly reflected in mathematics too: multiplication might be order-dependent, but addition/subtraction usually isn't. And if it is, you need to have parentheses.

    (a + b && c + d == e + f && g + h)

    Some moron came from C and wrote this, and now his code doesn't work. Also, we have the %-problems to deal with. Ok, here are three ideas: Either bucket everythng up in more than one bucket (like conventional languages do) - we could even have one bucket for % and the like that gives warning unless you have parentheses explicitly. Or, make some operators friends, and ban others - need for parentheses all over the place. Or: Have predefined buckets, and remove % and call it .modulo with .isEven and .isOdd shortcuts. Warning on every ambiguous use that needs parentheses. This also prevents people from abusing operators. I like this last one best!

  • A description was added

    JoelS08/20/2014 at 22:25 0 comments

    A description to the project was added. The text is put into the project description instead of here. That way, you can always read the most updated version.

View all 4 project logs

Enjoy this project?

Share

Discussions

jaromir.sukuba wrote 04/03/2015 at 19:17 point

Any news about this project? Seems like good idea.

  Are you sure? yes | no

JoelS wrote 06/29/2015 at 22:54 point

Life got in between... But one day I will hopefully have time to continue the project ;) Successfully developing a language will take much time (making it successful in the community even more so). I have some ideas about the type system etc. that I hope is a winning concept, things I've never seen in other languages but that I hope will support one too often neglected part of software development: code maintenance. In the mean time, maybe someone else can use the ideas. I think the "higher order functions" is a real killer feature (if its not abused by developers)!

  Are you sure? yes | no

Anderson Antunes wrote 08/21/2014 at 01:47 point
Great idea!

  Are you sure? yes | no

Similar Projects

Does this project spark your interest?

Become a member to follow this project and never miss any updates