Random thoughts about Sylph

March 4, 2015

Some random thoughts on Sylph: the programming language I want.

My own wishlist:

Build/configuration settings can be bad news.  I think this is mostly a problem for older, crusty compiled languages like C and C++, but there are interpreters that are just as bad (cough, PHP).  If we really can’t do without configuration settings, they should at least be standardized and located in the source code rather than something separate you have to know how to throw at each particular compiler or interpreter.

I’d like the same reasoning to apply to makefiles, too; rather than being something external, make them part of the language.  If some source only applies to certain platforms, that information can be in the source.

Java comes close in that all you have to tell the compiler is which source file to start with.  I’d like to get rid of even that, though, so that all you need to do is point the compiler or interpreter at the folder containing the source code, and let it take things from there.  I also don’t like that Java requires the source have a particular directory structure.  Ideally, the directory structure wouldn’t matter; heck, ideally you could append all of the source code files, in any order, into a single file and it would still have the same meaning.  Don’t know if that’s realistic.  I guess if a single build produces multiple applications, you’ll need to tell the interpreter which one to run – but if nothing else it could be a selection from a menu rather than a freeform guess.

I might prefer a family of languages, with certain properties in common and, above all, the ability to mix them freely.  In particular, I’d want a special language for database operations, probably not much like SQL – but the key idea here is that you can put the database statements right in the middle of a function written in a regular language, rather than having to build a string or mess about with parameterization functions.

I’m ambivalent about function overloading, but if I do have it, I want it to be very obvious to the programmer which function you’re calling at any given time.  In particular, if the types don’t match one of the overloads exactly, I want to have to convert them myself, not have the language guess what I meant.  Non-overloaded functions can still do automatic conversion where appropriate.
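To make the "exact match or convert it yourself" rule concrete, here is a minimal sketch in Python standing in for the imagined language. ExactOverload and area are hypothetical names invented for this illustration, and the check happens at run time rather than in a compiler.

```python
class ExactOverload:
    """Hypothetical helper: dispatch on the exact argument types, never convert."""

    def __init__(self, name):
        self.name = name
        self._impls = {}  # tuple of types -> implementation

    def register(self, *types):
        def decorator(func):
            self._impls[types] = func
            return func
        return decorator

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        try:
            impl = self._impls[key]
        except KeyError:
            # No guessing: the caller has to convert the arguments explicitly.
            raise TypeError(f"{self.name}: no overload for {key}") from None
        return impl(*args)


area = ExactOverload("area")


@area.register(int, int)
def _area_int(w, h):
    return w * h


@area.register(float, float)
def _area_float(w, h):
    return w * h
```

With this, area(2, 3) picks the int overload, area(2, 3.0) is rejected with a TypeError, and area(float(2), 3.0) works because the caller did the conversion.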

As for the cases where function overloading is all about converting types according to certain rules, e.g., x, y -> Point(x,y), it almost seems that what you want is for the function to describe how the compiler/interpreter should parse calls to it?  I’m not sure what that would look like, but it sounds as if it could be generalized into something very powerful.  (Perhaps too powerful.  Don’t want to get seduced by the Dark Side here.)  On the other hand, if someone can make something sensible out of this, the same mechanism might also be able to provide a much safer replacement for C macros.

On the third hand, if we stick to the original idea, it could be as simple as

void draw_point(p:Point or p:new Point(x:int, y:int), c:Color)


This one is implicit, I think, in some of the original post’s ideas, but it should be possible to include code in a program that runs at compile-time.  Static initializers would do it by default, but there should also be an explicit way of saying “this segment of code here?  run it now.”

Safety/Memory Management: I’m thinking by default you can’t get a pointer to a variable; only if it has been declared in a way compatible with pointerhood.  Perhaps you have special container types that allow you to have pointers to objects inside them; the lifetime of the object, unless explicitly deleted, is the lifetime of the container.  Pointers to a deleted object automatically go null, or raise an exception if the pointer type doesn’t allow null.  Or something.  Maybe different container types have different rules – this one is reference counted, this one is explicit-delete only.  There should perhaps be an escape hatch.
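One way the container idea might look, sketched in Python. Pool and Handle are hypothetical names for this illustration only. The pool owns the objects, and a handle either goes "null" (get) or raises (require) once its object has been deleted. In this sketch a handle also keeps its pool alive, which a real design might not want.

```python
class Pool:
    """Hypothetical container that owns its objects (explicit-delete flavour)."""

    def __init__(self):
        self._slots = {}   # handle id -> owned object
        self._next_id = 0  # ids are never reused, so stale handles stay stale

    def add(self, obj):
        handle_id = self._next_id
        self._next_id += 1
        self._slots[handle_id] = obj
        return Handle(self, handle_id)

    def delete(self, handle):
        self._slots.pop(handle._id, None)


class Handle:
    """Hypothetical 'pointer' into a Pool."""

    def __init__(self, pool, handle_id):
        self._pool = pool
        self._id = handle_id

    def get(self):
        """Nullable access: returns None once the object has been deleted."""
        return self._pool._slots.get(self._id)

    def require(self):
        """Non-nullable access: raises once the object has been deleted."""
        try:
            return self._pool._slots[self._id]
        except KeyError:
            raise LookupError("object was deleted") from None
```

A reference-counted variant would be a different Pool class with the same Handle interface, which is roughly the "different container types have different rules" idea.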

Indentation: I’m not sure about getting rid of braces, perhaps because I’m old enough to think that sometimes I might have to type in code from a printed or handwritten page, image or the like, rather than always having access to the original source files and/or being able to copy and paste.  I would like the language to require that the braces and the indentation match up, though.  Perhaps that means an IDE could just not show you the braces if you don’t want to see them?

Also, there are situations where even reading the code might be awkward.  If you’re six or seven indentations deep, and then drop out of several at once, it might not be clear.  “Was that the end of three blocks or four?  Which of the blocks two pages up lines up?”

Classes: if you’re willing to require complete source code (or a source-equivalent, like bytecode) then instead of using inheritance to modify the behaviour of a type, you could have a language construct that says “hey, make me a new type with all the same code as the old type, except for these changes”.  Fragile if the upstream code changes, of course, but no more so than inheritance.  Plus, from the compiler’s view the two types are unrelated, making everything simpler.
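A rough approximation of that construct in Python, as a sketch only. copy_type_with is a hypothetical helper, not a standard function; it works at run time, reuses the original function objects rather than duplicating their code, and would break methods that rely on zero-argument super().

```python
def copy_type_with(original, name, **changes):
    """Hypothetical helper: build an unrelated type with the same members plus changes."""
    namespace = {
        key: value
        for key, value in vars(original).items()
        # These two descriptors belong to the original type; the new type gets its own.
        if key not in ("__dict__", "__weakref__")
    }
    namespace.update(changes)
    # Same bases as the original, so the new type is a sibling, not a subclass.
    return type(name, original.__bases__, namespace)


class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()


# Same code as Stack, except that pop takes from the front.
Queue = copy_type_with(Stack, "Queue", pop=lambda self: self.items.pop(0))
```

Here issubclass(Queue, Stack) is False, so nothing that expects a Stack will accept a Queue by accident.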

Traits: similarly, you can eliminate dispatch complications if the compiler can generate as many copies of a function, with different types, as it decides it needs.

Exceptions: for the simplest cases, how about something like

value = some_dict[key] or default_value

… although that assumes there’s only one kind of (acceptable) failure, and the function knows what it is.  It also doesn’t deal with your example where you want to add the key if it doesn’t exist, though I suppose that might just be

value = some_dict[key] or (some_dict[key] = default_value)

so long as you’re happy about assignment being an operator.  That could also be implemented much more efficiently than a real exception; it’s just a hidden extra return value.
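For comparison, Python's built-in dict already covers both of these cases with methods rather than syntax. The inventory dictionary below is just made-up sample data.

```python
inventory = {"apples": 3}  # made-up sample data

# Roughly `value = some_dict[key] or default_value`:
# no exception is raised and nothing is inserted.
pears = inventory.get("pears", 0)

# Roughly `value = some_dict[key] or (some_dict[key] = default_value)`:
# returns the existing value, or stores the default and returns it.
plums = inventory.setdefault("plums", 10)
```

Note that Python's own `or` would not do the job here: `inventory.get("pears") or 0` also replaces values that are present but falsy, such as 0 or an empty string, which is the "only one kind of failure" problem in another guise.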

Operator precedence: except for a (relatively) few common, well-understood cases, just don’t have any precedence; require the programmer to use brackets.  That does mean the compiler would have to know which operators were associative, though; I don’t want to have to say (a+b)+c or a+(b+c) unless they’re actually different.  (But if they are different, then I do want the compiler to remind me of the fact by forcing me to include the parentheses.)


Typically, when using threading for I/O, you don’t really need the “threads” to run simultaneously.  Give them a different name (fibers, perhaps, à la Windows) and have them run one at a time, switching between them only when they do a wait; that makes it a lot easier to reason about thread safety.
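Python's asyncio tasks behave much like the fibers described here, so they make a reasonable sketch of the idea. The bump and main names are made up for this example. Each task does an unlocked read-modify-write on shared state, which is safe only because a switch can happen at the await and nowhere else.

```python
import asyncio


async def bump(state, times):
    for _ in range(times):
        current = state["count"]      # read
        state["count"] = current + 1  # write; no other task can run in between
        await asyncio.sleep(0)        # the only point where a switch can happen


async def main(fibers, times):
    state = {"count": 0}
    # Run several "fibers" on one thread, one at a time.
    await asyncio.gather(*(bump(state, times) for _ in range(fibers)))
    return state["count"]


total = asyncio.run(main(fibers=3, times=100))
```

No increments are lost, so total comes out as 300. With preemptive threads the same read-then-write pattern would need a lock.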

You’d still have to separate out the I/O code from the ordinary code, though.  Also, it would be nice if fibers didn’t need their own stacks, but that means the code has to be very flat; basically any time you want to call an async function the compiler has to be able to inline it.  That might be too restrictive, though I think it’s worth a try.

The “Python problem” can be solved by having both threads and fibers.  Fibers would be for I/O and similar tasks, and could share memory ownership; threads would be for concurrent processing, and no shared memory except via safe functions (and escape hatches, I guess).  Still doesn’t solve all the problems.

Standard library: some of these problems will go away if the compiler has source code for the standard library (apart from the native primitives, I guess) and has good support for trimming away unused code.  You still need some kind of versioning.

Optimization: speaking of trimming away unused code, that’s another case where functions need to be able to define their syntax and/or where some of the code needs to run at compile time.  So that if you’re using a function analogous to printf, it can work out at compile time that you aren’t using the floating-point bits and throw them away.  (Well, provided the format string is a constant, but that seems like a reasonable constraint.)

You could do something like that along the lines of preprocessor directives, except that it’s per-invocation:

do_stuff(x:int, fred:option, greg:option):
  if option fred
    # do fred stuff
  if option greg
    # do greg stuff

and then if you ever call do_stuff(x, fred) the compiler includes the version of do_stuff with the fred option in the executable, and if you don’t it doesn’t.
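Python has no compile-time specialization, so the closest I can sketch is a run-time analogue: build one variant of do_stuff per combination of options, and only for the combinations that are actually requested. specialize_do_stuff is a hypothetical name, and the lambdas are stand-ins for the "fred stuff" and "greg stuff".

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def specialize_do_stuff(fred=False, greg=False):
    """Hypothetical factory: build the do_stuff variant for one set of options."""
    steps = []
    if fred:
        steps.append(lambda x: x + 1)  # stand-in for "fred stuff"
    if greg:
        steps.append(lambda x: x * 2)  # stand-in for "greg stuff"

    def do_stuff(x):
        # Only the steps selected when the variant was built are present.
        for step in steps:
            x = step(x)
        return x

    return do_stuff


do_stuff_fred = specialize_do_stuff(fred=True)
```

Variants that are never requested are never built, which is the run-time counterpart of the compiler leaving them out of the executable.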

That’s all I got.