Sunday, October 29, 2006

Hyper compiler 0.3.30 released

I've just released a new version of the compiler for Hyper. You can download it from here.

A short list of changes in this release:
  • highlighter now uses the same style as the Hyper website
  • copy constructors are now checked and auto-generated if needed
  • 'inout' is now used for parameters that used 'var'
  • the 'this' keyword is now supported in expressions
  • access specifiers in a namespace are no longer supported
Besides these changes, the compiler has also been restructured internally a bit.

Friday, October 27, 2006

begin specification

Something that is not documented yet on the website is the begin directive. It appears at the top of the source file to indicate the entry point for that file if it is compiled as an executable. You specify the class that should be 'started', that is, the class that contains the static procedure called 'main'. This procedure is called when the executable is run.

Its use was mandatory, but I am now making it optional in one case: when the file contains only one class, because in that case the user's intention is obvious. This simplifies the Hello World example:
namespace Example
    class Hello
        static procedure main()
            system.out.printLn("Hello, World!")
        end
    end
end
This Hello World program will change even more later. Namespaces will be removed and a module system, or something like Java's packages, will be adopted. And the function (or 'procedure') that outputs text to the console will probably be renamed.

Wednesday, October 18, 2006

Concatenation & pass by reference

I am wondering how I will allow strings to be concatenated. The string type currently uses the + operator for this, but I am considering whether to introduce an operator specifically for this purpose. I suggest the ~ (tilde) operator, as it's not used yet and D also uses it for concatenation. An accompanying ~= operator can be introduced for appending to an existing string.
var s : string = "How" ~ " are " ~ "you"
# s = "How are you"
s ~= " today?"
# s = "How are you today?"
This would make an interesting syntax possible for output, similar to the << operator in C++. You could have an output stream that supports the ~ and ~= operators, which makes the following possible:
procedure writePoint(inout s : FileOutputStream, x & y & z : real)
    s ~= s ~ '(' ~ x ~ ", " ~ y ~ ", " ~ z ~ ')'
end
It's not ideal because s is still used twice. But maybe it's a start :-)
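For comparison, here is a minimal sketch of the C++ idiom mentioned above. Each << returns the stream, so the stream is named only once in the chain. The helper pointToString is a made-up name for illustration, and double stands in for Hyper's real.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// C++ counterpart of writePoint: operator<< returns the stream itself,
// so the whole point is written in one chained expression.
void writePoint(std::ostream& s, double x, double y, double z) {
    s << '(' << x << ", " << y << ", " << z << ')';
}

// Hypothetical helper, for illustration only: formats a point into a string
// by writing it to an in-memory stream.
std::string pointToString(double x, double y, double z) {
    std::ostringstream out;
    writePoint(out, x, y, z);
    return out.str();
}
```

A ~ operator on a Hyper output stream could follow the same pattern by returning the stream it was applied to.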

I am also thinking about whether 'in' parameters should be passed by value or by reference. As I wrote previously, 'in' parameters cannot be changed at all by the callee, so in my opinion it would be a nice idea to pass them by reference. This would make 'in' parameters fit in better, because 'inout' parameters are also passed by reference and 'out' parameters use a reference to write the result directly to the caller's location.

Sunday, October 08, 2006

Compiler goes public!

Yesterday I put the compiler sources on my website. The compiler is finally available to everyone who wants to take a look. You can download the sources from here. The released version is 0.3.29, a release that dates from mid-September but was only available to a limited test audience. I got almost no feedback, so I decided to make it available on my website to reach a larger pool of potential testers.

As said on the download page, this is a development release, so it probably has lots of bugs, and on top of that it doesn't support all 'official' language features yet. It also requires CMake to build. You will have to build it yourself from source because no precompiled binaries are available.
Another thing you need to know is that the compiler only accepts filenames in Unix format. So Windows users will have to use the compiler under Cygwin until Windows filenames are supported in some future release.

If you are a programmer, I hope you will try it and give some feedback! For information about the language, see the language reference on the website. The documentation you find there is not really complete at this time and is sometimes very brief, but it will give you a start.

Thursday, October 05, 2006

Various new features

On the language and compiler status page, I still have some unimplemented features left that I have not explained here yet. I will explain as much as I can in this post.

First, 'initializer lists'. I will assume that you know initializer lists from C++. In Hyper they are not very different. The only (semantic) difference is that Hyper allows fields to have an initializer in their declaration and C++ does not. In a constructor, a field is initialized with the value from the initializer list if one is present. If it is not, the initializer in the declaration of the field is used. If there is no initializer there either, the field's default constructor is used. The syntax consists of lines starting with a colon. It resembles C++, although C++ writes the whole list after a single colon, while Hyper allows it to be spread across multiple lines, each starting with a colon, for readability. An example to illustrate all this:
class Test
private:
    var a : string = "SSS"
    var b : real = 9.81
    var c : int
    var d : bool
    var e : char
    var f : * byte

public:
    procedure new()
        : a("Cookie")
        : c(16), f(new byte(13))
        # Other code
    end
end Test
In this example a gets the value "Cookie", b is initialized to 9.81, c becomes 16, d is initialized with its default constructor (which initializes it to false), f is initialized to a pointer to a byte with value 13, and e gives a (compile time) error because char doesn't have a default constructor (yet?).
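For readers coming from C++, roughly the same class can be sketched with a C++11 or later compiler, since those newer standards do accept initializers in field declarations. This is only an analogue under that assumption, not how the Hyper compiler works: unsigned char stands in for byte, char gets an explicit initializer because it is not an error in C++, and the fields are public only so the values are easy to inspect.

```cpp
#include <string>

// C++11 analogue of the Hyper class Test (an illustration, not Hyper semantics).
class Test {
public:
    // The member initializer list wins over the in-declaration initializers
    // for a and c; b keeps its declared value 9.81.
    Test() : a("Cookie"), c(16), f(new unsigned char(13)) {}
    ~Test() { delete f; }

    // Copying is disabled to keep the sketch short and the pointer safe.
    Test(const Test&) = delete;
    Test& operator=(const Test&) = delete;

    std::string a = "SSS";
    double b = 9.81;
    int c = 0;
    bool d = false;              // matches Hyper's default for bool
    char e = '\0';               // Hyper reports an error here instead
    unsigned char* f = nullptr;  // stand-in for Hyper's '* byte'
};
```

The lookup order is the same as described above: initializer list first, then the initializer in the declaration.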

Another feature: comparison operator chains. These allow the mathematical notation for comparison operators that are used together: 1 <= 20 = (19 + 1) < 75 etc. The notation has the meaning that is obvious to someone who has never programmed before: a = b = c is not the same as (a = b) = c, but is equivalent to a = b && b = c. The difference is that in a = b = c every expression is evaluated exactly once (b is not evaluated twice, as it is in a = b && b = c) and the order of evaluation is undefined: a is not necessarily evaluated before c, and the comparison of b and c does not necessarily happen after the comparison of a and b.
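The 'evaluated once' rule can be illustrated with a small C++ sketch. C++ has no comparison chains, so the chain 1 <= 20 = (19 + 1) < 75 is modelled by a hypothetical helper function, chainLeEqLt. Its arguments are each evaluated exactly once at the call site, and C++ leaves the order of argument evaluation unspecified, which loosely matches the rule described above. The counter and the function twenty are also made up for the demonstration.

```cpp
// Hypothetical helper modelling the chain  a <= b = c < d.
// (Hyper writes equality as '=', C++ as '=='.)
bool chainLeEqLt(int a, int b, int c, int d) {
    return a <= b && b == c && c < d;
}

// Counts its own calls, to show how often an operand is evaluated.
int calls = 0;

int twenty() {
    ++calls;
    return 20;
}
```

Calling chainLeEqLt(1, twenty(), 19 + 1, 75) runs twenty() once, whereas the spelled-out form 1 <= twenty() && twenty() == 19 + 1 runs it twice.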

Hyper will also provide two kinds of enums: nominal and ordinal ones. For now I only explain the nominal ones, as I haven't really figured out a syntax for the other kind yet. Enums are strongly typed in Hyper. They are not a named alias for constants of some numeric type. Nominal enums represent named, unordered values. They are in their own namespace, so you need to refer to the enum values explicitly by the name of the enum. Enum values are just listed, separated by commas or newlines. An example of an enum declaration:
enum Transportation
    Car, Bus,
    Airplane, Subway
    Boat
end
An enum can be declared anywhere a class declaration is allowed. The first member is considered the default value for variable/field declarations. I am also considering a different approach: there is no default, and if you want one you have to specify it explicitly. Remember to use Transportation.Bus instead of just Bus, because explicit qualification is required.
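The closest C++ counterpart is probably a scoped enum (enum class, C++11 and later), sketched below. It also requires qualified names and does not convert implicitly to a number, and a value-initialized variable gets the first member. Unlike the nominal enums described above, C++ still allows < and > between two values of the same enum. The function isPublicTransport is a hypothetical example, not something from Hyper.

```cpp
// Scoped enum: the values live inside Transportation and are strongly typed.
enum class Transportation { Car, Bus, Airplane, Subway, Boat };

// Hypothetical helper, for illustration only. Note the required qualification:
// writing just Bus would not compile.
bool isPublicTransport(Transportation t) {
    return t == Transportation::Bus || t == Transportation::Subway;
}
```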

That's it for this time. New features I still have to explain (later):
  • single inheritance
  • modules & imports (still requires some thinking)
  • copy procedures (not on the website yet)

Sunday, October 01, 2006

'in', 'inout' and 'out' parameters

This is an idea that I have been thinking about for some time. Hyper currently uses a parameter system that was borrowed from Oberon (I am not sure if its ancestors Modula-2 and Pascal already supported it) and Visual Basic. There are now two kinds of parameters: normal ones and variable parameters. Normal ones are passed by value and variable ones are passed by reference. This would be a perfect parameter system if it were used for a language without pointers. An example from the Hyper docs on my website illustrates the problem:
procedure isCellEmpty(x & y : nat, m : * [ ] const * [ ] const * const int) : bool
    return m[x, y] =$ null
end
So many const keywords just to make sure that no contents of the matrix are modified! Hyper needs a way to declare a parameter as 'input only'. Such a parameter would not need any const keywords, because for an input parameter it is obvious that no content can be modified:
procedure isCellEmpty(in x & y : nat, in m : * [ ] * [ ] * int) : bool
    return m[x, y] =$ null
end
So a keyword 'in' could be used for input parameters. The safest solution would be to make this behaviour the default for all parameters that don't specify another option, which would make the keyword 'in' redundant. The other option would be an input/output parameter, which is passed by reference:
procedure makePositive(inout x : int)
    if x < 0 then
        x = -x
    end if
end
And the const keyword can be used for data that is not supposed to be changeable:
# make sure that x points to the string to come first
procedure tinyAscendingSort(inout x & y : * const string)
    if y < x then
        var t : * const string = x
        x =$ y
        y =$ t
    end if
end
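For readers who know C++, the three procedures above map roughly onto references: an 'in' parameter behaves like a reference to const and an 'inout' parameter like a plain reference. The sketch below is only an analogue under that assumption. The matrix type (a vector of vectors of pointers, with nullptr for an empty cell) is a guess at what the Hyper type expresses, and the strings are compared through the pointers, which is how I read the comment on tinyAscendingSort.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// 'in' ~ reference to const: the callee can read the matrix but not change it.
// Assumed layout: rows of pointers to int, nullptr marks an empty cell.
bool isCellEmpty(std::size_t x, std::size_t y,
                 const std::vector<std::vector<const int*>>& m) {
    return m[x][y] == nullptr;
}

// 'inout' ~ non-const reference: the caller's variable is modified in place.
void makePositive(int& x) {
    if (x < 0) {
        x = -x;
    }
}

// 'inout' pointers to const strings: the pointers may be swapped,
// but the strings they point to cannot be modified.
void tinyAscendingSort(const std::string*& x, const std::string*& y) {
    if (*y < *x) {
        const std::string* t = x;
        x = y;
        y = t;
    }
}
```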
Another useful feature would be 'output only' parameters. I am not sure whether or not to include them, because they would behave very differently from other parameter types. They would also use an implicit reference to pass through their changes. In my opinion an output parameter should be initialized by the function that assigns it a value. This requires a new statement, an 'out' statement. Such a statement assigns a value to an out parameter. Every return path of a procedure must have an out statement for each out parameter. Example:
procedure selectBestCandidate(c1 & c2 : * Candidate, out best : * const Candidate)
    if c2.betterThan(c1) then
        out best = c2
    else
        out best = c1
    end if
end
Of course output parameters are best used for multiple output parameters, because otherwise you could just use the function's return value. Another example:
procedure getMinMaxAvg(i & j : int, out min & max : int, out average : real)
    out min = (i < j) ? i : j
    out max = (j < i) ? i : j
    out average = (real(i) + real(j)) / 2
end
An out statement initializes its output parameter by using its constructor, so the specified value is like an initializer for a variable. We also need a special function call syntax for output parameters. This is because an output argument is not an expression but a variable declaration:
procedure p(x & y : int)
    getMinMaxAvg(x, y, out smallest, out largest, out middle)
    var diff : int = largest - smallest
end
The declaration of an output argument happens in an expression. To avoid dependencies on expression evaluation order, an output argument can only be used after the entire expression (so that it isn't a subexpression of something else) is evaluated.
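C++ has no output-only parameters, but a similar effect can be sketched by returning a small struct and unpacking it with structured bindings (this assumes a C++17 compiler). The results are then declared at the call site, much like 'out smallest' declares a variable in the Hyper call. The struct MinMaxAvg and the function spread are made-up names for this illustration.

```cpp
// Hypothetical result type standing in for the three out parameters.
struct MinMaxAvg {
    int min;
    int max;
    double average;
};

// Every return path fills in all three results, like the out statements.
MinMaxAvg getMinMaxAvg(int i, int j) {
    return { (i < j) ? i : j,
             (j < i) ? i : j,
             (static_cast<double>(i) + static_cast<double>(j)) / 2 };
}

// Counterpart of procedure p: the three results are declared by the call.
int spread(int x, int y) {
    auto [smallest, largest, middle] = getMinMaxAvg(x, y);
    (void)middle;  // not needed here, as in the Hyper example
    return largest - smallest;
}
```

As in the rule above, the unpacked names only exist after the whole call expression has been evaluated.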

As said earlier, I am not really convinced about the current idea of output only parameters, but the input and input/output parameter ideas are good enough for me.