Wednesday, December 27, 2006

progress...

It's been a while since I last wrote something, but I have made progress since then.

First of all, the Hello world example has changed again. Here is the new (and hopefully final) version:
import system.stdio

class Hello
    static procedure main()
        system.Out.line("Hello, World!")
    end
end
The 'StdIO' class has been replaced by 'Out', which will provide simple console output. The 'printLn' procedures have been renamed to 'line'.

The compiler implementation has come a long way. The compiler on SVN trunk now supports the new namespace system and can import 'system.stdio', which means that all sample programs in the tests/programs directory now compile successfully. A fun new feature is that the compiler detects TODO and FIXME in comments and emits warnings for them. These warnings are enabled by default, but there are new compiler options to turn them off.
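To give an idea of what the TODO/FIXME warning involves, here is a minimal Python sketch. It is not the compiler's actual code. It assumes that '#' starts a line comment (as in the Hyper samples on this blog) and, for brevity, it does not handle a '#' inside a string literal. The function name, the tuple format and the 'enabled' flag are made up for the illustration.

```python
import re

# Hypothetical marker pattern: whole-word TODO or FIXME.
MARKER = re.compile(r"\b(TODO|FIXME)\b")

def find_comment_markers(source, enabled=True):
    """Return (line_number, marker, comment_text) for each TODO/FIXME in a comment.

    Assumes '#' starts a line comment and ignores '#' inside string literals.
    'enabled' stands in for the compiler option that turns the warnings off.
    """
    if not enabled:
        return []
    warnings = []
    for number, line in enumerate(source.splitlines(), start=1):
        _, hash_sign, comment = line.partition("#")
        if not hash_sign:
            continue  # no comment on this line
        for match in MARKER.finditer(comment):
            warnings.append((number, match.group(1), comment.strip()))
    return warnings
```

Only text after the '#' is searched, so an identifier that happens to be called TODO does not trigger a warning.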

The compiler has been restructured internally as well: I have completed 3 major refactorings. This is not the end of the road, though; many other improvements will follow.

There is a good chance that the next release of the compiler will have version number 0.4.0 because of all the improvements that have been made. The milestone for 0.4.0 will then be "front-end mature enough to start implementing the back-end".

I have also created a new SVN branch where I will start work on the compiler back-end. As I said earlier, I will use LLVM for this, and I have imported LLVM 1.9 into the branch. The first thing to do is to get LLVM to build with CMake: the LLVM developers build it with GNU autotools, but I use CMake for the front-end. Then I will try to get the Hello World program compiled with it.

Today I have written a bunch of docs again. You can find class references for the built-in types on the website.

I currently use Subversion, but I am thinking about switching to Bazaar.

Thursday, December 14, 2006

Hyper compiler 0.3.31 released

I have released a new version of the compiler. This release is quite a bit bigger than the previous one: it contains more new features and its internal structure has been improved.

The compiler now checks for the presence of return statements in procedures that return something. Another very important new feature is public/private access checking. Some small things are not checked yet, such as the use of private conversion constructors for passing arguments to a procedure. Const checking is now complete (at least to my knowledge), which means that the compiler does additional checks for const procedures and for calling procedures on a const object.

Some things that are not listed in the changelog: a new test program was added, the Hello World test program. This already uses the new import and namespace semantics so the compiler currently rejects it. And the compiler now allows import directives but currently ignores them.

Monday, December 11, 2006

dynamic arrays and array sizes

Arrays have been supported for some time, but creating an array dynamically wasn't possible yet. Time to change that. The syntax is simple:
procedure xxxx(a & b : nat)
    var x : * [5] int = new [5] int()
    var y : * [][] real = new [a][b] real
end
As you can see, the syntax is "new" followed by the type of the array and an optional empty pair of parentheses. Dynamic arrays initialize their elements with their default value. The size of the new array must be fully specified (but for its elements this is not required):
new [] int         # illegal
new [][10] int     # illegal
new [10] * [] int  # allowed
This brings us to the compatibility of array sizes. When pointing an array variable to an array, the sizes that ARE specified must be evaluable at compile time and must be equal. But you don't HAVE to specify the sizes, of course. Open arrays accept any size, including no specified size or a size that isn't known at compile time.
const globalC : nat = 17  # constant field

procedure test(i : * [] int, j : * [17] int, n : nat)
    var a : * [9] int = j # ERROR: 9 != 17
    var b : * [17] int = j # OK
    var c : * [17] int = i # ERROR: unknown size of i
    var d : * [globalC] int = j # OK, globalC = 17
    var e : * [n] int = j # ERROR: value of n unknown
    var f : * [] int = i # OK
    var g : * [] int = j # OK
end
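The size rules from this example can be written down as a small check. The following Python sketch is only an illustration under my own assumptions, not the compiler's code: it handles a single dimension, a size is modelled as an int literal, a name, or None for an open array '[]', and the function names and error texts are invented.

```python
def evaluate_size(size, constants):
    """Return the compile-time value of a size, or None if it is not known.

    'size' is an int literal, the name of a constant or variable (str),
    or None for an open array '[]'.
    """
    if isinstance(size, int):
        return size
    if isinstance(size, str):
        return constants.get(size)  # parameters and variables are absent
    return None

def check_pointer_assignment(target_size, source_size, constants):
    """Return None if the assignment is allowed, else an error message."""
    if target_size is None:
        return None  # open array: accepts any source size
    target = evaluate_size(target_size, constants)
    if target is None:
        return f"value of {target_size} unknown"
    source = evaluate_size(source_size, constants)
    if source is None:
        return "unknown size of source"
    if target != source:
        return f"{target} != {source}"
    return None
```

With constants = {"globalC": 17}, the call check_pointer_assignment("globalC", 17, constants) is accepted like variable d above, while check_pointer_assignment("n", 17, constants) is rejected like variable e.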
For now you cannot specify an initializer for an array. That's why you don't specify arguments between the parentheses when creating a dynamic array. And that's why an array variable or field can't have an initializer part.

Monday, December 04, 2006

sourcefiles and namespaces again

I was a little brief in my previous post about sourcefiles, namespaces and imports, so I'll try to explain it a bit more here. Here's an example of multiple sourcefiles working together.

# File "someDir/MyApp/GUI/mainwindow.hyp"
namespace MyApp.GUI

class MainWindow
# (...)
end


# File "someDir/MyApp/Data/store.hyp"
namespace MyApp.Data

class DBStorage
# (...)
end


# File "someDir/MyApp/Core/main.hyp"
namespace MyApp.Core

import MyApp.GUI.mainwindow
import MyApp.Data.store

static class Main
    static procedure main()
        # main program's entry point
        var dbs : MyApp.Data.DBStorage
        dbs.open()
        var win : MyApp.GUI.MainWindow
        win.show()
        # (...)
    end
end
I hope this clears it up. Every file is in a namespace. When you import a file you need to specify the namespace AND the name of the file you want to use. The classes inside a file are in the namespace of that file, so that's why the "main()" code uses the full names like "MyApp.Data.DBStorage" instead of just "DBStorage". To get rid of the long names I will support 'using' declarations (in fact aliases), but that's for later.

Something we don't support now is having public/private members in a sourcefile. The Main class of the example above does not have to be public. For now all direct source file members are simply public.

A program often needs to use libraries outside of its own codebase. Therefore Hyper will support some variation of the 'class path' concept from Java, but with a different name. I suggest something like 'codebase path' or 'code path'. The default 'code path' will be empty, which means the compiler will only look at the sources you are compiling now (including the imported files from the same codebase). How about closed source libraries? I am thinking of using a concept similar to D's interface files (see the end of this page). This means having a second type of sourcefile that only contains the interface parts of each class (the procedure headers etc.).
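A 'code path' lookup could work roughly like the Python sketch below. Nothing here is decided yet, so treat it as an assumption-laden illustration: the function name is invented, and searching the current codebase first and then the code path entries in order is my guess at a sensible rule. Only the '.hyp' extension and the namespace-to-directory layout come from the examples above.

```python
from pathlib import Path

def resolve_import(import_name, codebase_root, code_path=()):
    """Return the first existing source file for a dotted import name, or None.

    Looks in the codebase being compiled first, then in each 'code path'
    entry in order (assumed search order). An empty code path, the default,
    means only the current codebase is searched.
    """
    relative = Path(*import_name.split(".")).with_suffix(".hyp")
    for root in (codebase_root, *code_path):
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None
```

For example, resolve_import("MyApp.Data.store", "someDir") would look for "someDir/MyApp/Data/store.hyp" and return None if that file does not exist.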

Off-topic:
* I am tempted to have the compiler generate C++. This would be somewhat easier than using LLVM, but it would require an extra compilation to get a working program. It could be an acceptable temporary solution.
* The next compiler release will support public/private access checking, full 'const' checking and maybe also 'static' checking. But I am unsure about going for a small or a major release. A major release takes much longer to get out, but allows for large internal improvements in the compiler. A minor release would be version 0.3.31, and a major release would be 0.4.0.
* Restricted pointers will definitely be part of the language. I just don't know yet when to start implementing them and whether or not to wait until after the next major release (0.4.0). I would like to have a better name for them. "References" is a candidate, but it could be too confusing for C++ users who don't know yet what they really are, since they differ a lot from the 'references' in C++.
* Strict in/inout parameters will probably be added when restricted pointers are already implemented.

Sunday, December 03, 2006

importing other sourcefiles

I think I have finally found a way to have multiple sourcefiles working together. I have based it mostly on the package system from Java, but I don't call it packages anymore. I have decided to keep the 'namespace' keyword for this purpose. In Java each file that is not in the default package is in some specified package; in Hyper each file is in some namespace. This means that namespaces are no longer declared in blocks like classes are, but in one line at the top of the sourcefile. It will also be possible to have a sourcefile that is not in a namespace; this will be useful for one-file test programs. More about that later.

Each sourcefile is in a directory structure that corresponds to the namespace of that file (as with packages in Java). So a file in namespace "Foo.Bar.Baz" could be named "Foo/Bar/Baz/filename.hyp". A file can import other files by specifying the namespace and name of that file, without the extension. So this will be something like:

namespace Abc.Defg.Stuv.Xy
import Foo.Bar.Baz.filename
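
The relation between an import, a namespace and a file location can be sketched in a few lines of Python. This is only an illustration of the convention described above; the helper names are invented, and the '.hyp' extension is taken from the example file name.

```python
from pathlib import PurePosixPath

def split_import(import_name):
    """Split 'Foo.Bar.Baz.filename' into ('Foo.Bar.Baz', 'filename')."""
    namespace, _, filename = import_name.rpartition(".")
    return namespace, filename

def expected_location(namespace, filename, extension=".hyp"):
    """Return the relative path a file in 'namespace' is expected to have."""
    return PurePosixPath(*namespace.split(".")) / (filename + extension)
```

So the import above names the file "filename" in namespace "Foo.Bar.Baz", and expected_location("Foo.Bar.Baz", "filename") gives the path "Foo/Bar/Baz/filename.hyp".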

There are public and private imports, and an import is private by default. If file 1 is publicly imported in file 2, then any file that imports file 2 will have file 1 imported with it. This is not the case if file 1 is privately imported in file 2. For private imports the compiler will have to check that there are no things from the private import exposed to the outside.

Imports are allowed to be circular; this means that file 1 can import file 2 while file 2 also imports file 1. Such things are of course to be used as little as possible. Disallowing circularity is not feasible because these things are not always avoidable, and the language currently does not allow for forward declarations as C++ does.
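
Here is a small Python sketch of how public and private imports could be followed without getting stuck on a circular import. It is my own illustration, not compiler code: the data layout and function name are invented, and it assumes that public imports chain transitively (a public import of a public import is also visible).

```python
def visible_files(start, imports):
    """Return the set of files visible from 'start' through its imports.

    'imports' maps a file name to a list of (imported_file, is_public) pairs.
    Direct imports are always visible; imports of an imported file are only
    visible if they are public. The 'seen' set makes circular imports safe.
    """
    seen = set()
    pending = [target for target, _ in imports.get(start, [])]
    while pending:
        current = pending.pop()
        if current in seen or current == start:
            continue  # already handled, or a cycle back to the start file
        seen.add(current)
        for target, is_public in imports.get(current, []):
            if is_public:
                pending.append(target)
    return seen
```

Each file is processed at most once, so two files that import each other simply end up in each other's visible set.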

A sourcefile that is not in a namespace will not be able to import things from other sourcefiles, only from the standard libraries (system.****). It also cannot be imported by any other sourcefile. This is to minimize its usage. Files without a namespace are not put in some 'default' namespace as in Java; they aren't in any namespace at all. So there is no relation to the directory such a file resides in.

The standard library will use the 'system' keyword as the root namespace. Standard input/output will be available with "import system.stdio". (I think I will use the convention of a lowercase identifier for the name of a sourcefile.) This file contains a static class "StdIO" with a number of procedures for stdout printing. There are "print" procedures for literal printing and "printLn" procedures for printing with an additional newline. This would make the "Hello World" example look like this:

import system.stdio

class Hello
    static procedure main()
        system.StdIO.printLn("Hello, World!")
    end
end

It sure looks better than the current version.