You may want to look at Julia.
It's a modern, dynamically typed language with Lisp-inspired metaprogramming.
It also JIT-compiles using LLVM. The standard library is written in Julia too and is compiled and cached on first use, i.e. a form of install-time AOT compilation.
As the LLVM IR is statically typed, I think this has most of the features you were looking for, although, as with many dynamic languages, REPL usage was an important consideration, hence the JIT compilation.
One interesting data point from a paper I saw (section 5.2): the standard library had around 135k variables, of which about 80k had a fixed, static type. The rest had to stay represented as the variant type Any.
Depending on why you're interested in dynamic to static, this may be of value when you look at how dynamic you make your language. Obviously, this has some bearing on performance although, in practice, Julia seems quite quick.
There are a number of interrelated questions here; I'll try to separate them as best I can.
Why do other languages build on LLVM IR and not clang AST?
This is simply because clang is a C/C++ front end and the AST it produces is tightly coupled to C/C++. Another language could use it, but it would need near-identical semantics to some subset of C/C++, which is very limiting. As you point out, parsing to an AST is fairly straightforward, so restricting your semantic choices is unlikely to be worth the small saving.
However, if you're writing tooling for C/C++, e.g. static analysers, then re-using the AST makes a lot of sense, as it's a lot easier to work with the AST than with the raw text.
Why is LLVM IR the form it is?
LLVM IR was chosen as an appropriate form in which to write compiler optimisations. As such, its primary feature is that it's in SSA form. It's quite a low-level IR so that it is applicable to a wide range of languages, e.g. it doesn't type memory, as this varies a lot across languages.
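To make "SSA form" a bit more concrete, here is a minimal Python sketch of the renaming idea for straight-line code only. The tuple format and the helper name to_ssa are my own assumptions for illustration, not anything from LLVM, and real SSA construction also has to handle branches and loops (phi nodes), which this toy ignores.

```python
def to_ssa(statements):
    """Rename variables so that every name is assigned exactly once.

    `statements` is a list of (target, op, args) tuples, where each arg is
    either a variable name (str) or an integer constant. This is a toy
    format invented for this sketch.
    """
    version = {}  # latest version number seen for each source variable
    result = []
    for target, op, args in statements:
        # A use refers to the most recent definition of the variable.
        # Variables never assigned in this block count as version 0 (inputs).
        new_args = [
            f"{a}.{version.get(a, 0)}" if isinstance(a, str) else a
            for a in args
        ]
        # Each assignment introduces a fresh name that is never reassigned.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}.{version[target]}", op, new_args))
    return result


# x = 1; x = x + 2; y = x * x
program = [
    ("x", "const", [1]),
    ("x", "add", ["x", 2]),
    ("y", "mul", ["x", "x"]),
]
ssa = to_ssa(program)
```

Because every name now has exactly one definition, an optimisation pass can find the value behind any use without tracking which assignment to x was the most recent one.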
Now, it happens to be the case that writing compiler optimisations is quite a specialist task and is often orthogonal to language feature design. However, having a compiled language run fast is a fairly general requirement. Also, the conversion from LLVM IR to ASM is fairly mechanical and not generally interesting to language designers either.
Therefore, lowering a language to LLVM IR gives a language designer a lot of "free stuff" that is very useful in practice, leaving them to concentrate on the language itself.
Would a different IR be useful (OK, not asked but sort of implied)?
Absolutely! ASTs are quite good for certain transformations on the program structure but are very hard to use if you want to transform program flow. An SSA form is generally better. However, LLVM IR is very low-level, so a lot of the high-level structure is lost (on purpose, so it's more generally applicable). Having an IR between the AST and the low-level IR can be beneficial here. Both Rust and Swift take this approach and have a higher-level IR (MIR and SIL, respectively) between the AST and LLVM IR.
Best Answer
Yes, there is indeed. Sort of.
Newspeak has no static state and no global state. This means that the only possible way to get access to a dependency is to have it explicitly injected. Obviously, this means that the language (or, in the case of Newspeak, more precisely the IDE) needs to make dependency injection easy; otherwise the language will be unusable.
So, the language is not designed for DI; rather, the necessity for DI is a consequence of the language design.
If there is no static state and no global state, then you cannot just "reach out" into the ether and pull something out. For example, in Java, the package structure is static state. I can just say java.lang.String and I have myself the String class. That is not possible in Newspeak. Everything you work with has to be explicitly provided to you, otherwise you just can't get at it. So, everything is a dependency, and every dependency is explicit.
You want a string? Well, you have to first ask the stdlib object to hand you the String class. Oh, but how do you get access to the stdlib? Well, you have to first ask the platform to hand you the stdlib object. Oh, but how do you get access to the platform? Well, you have to first ask someone else to hand you the platform object. Oh, but how do you get access to that someone else? Well, you have to first ask yet another someone else to hand you the object.
How far down the rabbit hole does this go? Where does the recursion stop? All the way, actually. It doesn't stop. Then, how can you write a program in Newspeak? Well, strictly speaking, you can't!
You need some outside entity that ties it all together. In Newspeak, that entity is the IDE. The IDE sees the whole program. It can wire the disparate pieces together. The standard pattern in Newspeak is that the central class of your application has an accessor called platform, and the Newspeak IDE injects an object into that accessor that has methods which return some of the basic necessities of programming: a String class, a Number class, an Array class, and so on.
If you want to test your application, you can inject a platform object whose File method returns a class with dummy methods. If you want to deploy your application to the cloud, you inject a platform whose File class actually is backed by Amazon S3. Cross-platform GUIs work by injecting different GUI frameworks for different OSs. Newspeak even has an experimental Newspeak-to-ECMAScript compiler and HTML-backed GUI framework that allows you to port a fully-featured GUI application from native desktop into the browser with no changes, just by injecting different GUI elements.
If you want to deploy your application, the IDE can serialize the application into an on-disk object. (Unlike its ancestor, Smalltalk, Newspeak has an out-of-image object serialization format. You don't have to take the entire image with you, precisely because all dependencies are injected: the IDE knows exactly which parts of the system your application uses and which it doesn't. So, it serializes exactly the connected subgraph of the object space that comprises your application, nothing more.)
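The platform-injection pattern described above can be roughly imitated in Python. This is only a sketch: NotesApp, FakePlatform and DiskPlatform are names I made up, and Python does not forbid global state the way Newspeak does, so the discipline here is by convention only.

```python
class FakePlatform:
    """Hypothetical platform for tests: its File class never touches the disk."""

    def __init__(self):
        store = {}  # private to this platform instance, not global state

        class File:
            def __init__(self, name):
                self.name = name

            def write(self, text):
                store[self.name] = text

            def read(self):
                return store[self.name]

        self._file_class = File

    def File(self):
        return self._file_class


class DiskPlatform:
    """Hypothetical platform whose File class is backed by the real file system."""

    def File(self):
        class File:
            def __init__(self, name):
                self.name = name

            def write(self, text):
                with open(self.name, "w", encoding="utf-8") as f:
                    f.write(text)

            def read(self):
                with open(self.name, encoding="utf-8") as f:
                    return f.read()

        return File


class NotesApp:
    """The application only ever reaches the outside world through `platform`."""

    def __init__(self, platform):
        self.platform = platform

    def save(self, name, text):
        self.platform.File()(name).write(text)

    def load(self, name):
        return self.platform.File()(name).read()


# In a test, whoever wires things up injects the fake platform.
app = NotesApp(FakePlatform())
app.save("todo.txt", "buy milk")
```

NotesApp itself never changes; swapping FakePlatform() for DiskPlatform() (or something backed by remote storage) is entirely the decision of whoever does the wiring, which is the role the IDE plays in Newspeak.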
All of this works simply by taking object-orientation to the extreme: everything is a virtual method call ("message send" in Smalltalk terminology, of which Newspeak is a descendant). Even the superclass lookup is a virtual method call! Take a declaration like class Foo extends Bar, or its Newspeak equivalent.
In Java, this will create a name Foo in the static global namespace, look up Bar in the static global namespace, and make Bar Foo's superclass. Even in Ruby, which is much more dynamic, this will still create a static constant in the global namespace.
In Newspeak, the equivalent declaration means: create a getter method named Foo and make it return a class that looks up its superclass by calling the method named Bar. Note: this is not like Ruby, where you can put any executable Ruby code as the superclass declaration, but the code will only be executed once when the class is created and the return value of that code becomes the fixed superclass. No. The method Bar is called for every single method lookup!
This has some profound implications:
since an inner class is just a method call that returns a class, you can override that method in a subclass of the outer class, so every class is virtual. You get virtual classes for free.
since the superclass is just a method call that returns a class, and you can override that method in a subclass of the outer class, inner classes defined in the superclass can have a different superclass in the subclass. You get class hierarchy inheritance for free.
and lastly, the most important for this discussion: since (apart from the ones you defined in your class, obviously) you can only call methods in your lexically enclosing class(es) and your superclass(es), a top-level outermost class cannot call any methods at all except the ones that are explicitly injected. A top-level class doesn't have an enclosing class whose methods it could call, and it cannot have a superclass other than the default one, because the superclass declaration is a method call, and that call obviously can't go to the superclass (it is the superclass) and it also can't go to the lexically enclosing class, because there isn't any. What this means is that top-level classes are completely encapsulated: they can only access what they explicitly get injected, and they only get injected what they explicitly ask for. In other words: top-level classes are modules. You get an entire module system for free. In fact, to be more precise: top-level classes are module declarations, and their instances are modules. So, you get a module system with parametric module declarations and first-class modules for free, something which many, even very sophisticated, module systems cannot do.
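The first two implications can be approximated in Python by making inner classes the return values of ordinary (and therefore overridable) methods. This is only an analogy under my own assumptions: the names Outer, Themed, Fancy, Base and Widget are invented, and unlike Newspeak, Python resolves the superclass when the inner class is created, not on every method lookup.

```python
class Outer:
    def Base(self):
        class Base:
            def greet(self):
                return "base"
        return Base

    def Widget(self):
        # The superclass is obtained through a virtual method call on the
        # outer object, so subclasses of Outer can substitute another one.
        class Widget(self.Base()):
            def describe(self):
                return "widget/" + self.greet()
        return Widget


class Themed(Outer):
    # Class hierarchy inheritance: Widget is inherited unchanged from Outer,
    # yet it now extends this overridden Base.
    def Base(self):
        parent = super().Base()

        class Base(parent):
            def greet(self):
                return "themed"
        return Base


class Fancy(Outer):
    # Virtual classes: the inner class itself is overridden like a method.
    def Widget(self):
        parent = super().Widget()

        class Widget(parent):
            def describe(self):
                return "fancy " + super().describe()
        return Widget


plain = Outer().Widget()().describe()
themed = Themed().Widget()().describe()
fancy = Fancy().Widget()().describe()
```

Here plain is "widget/base", themed is "widget/themed" and fancy is "fancy widget/base", even though Themed never mentions Widget and Fancy never mentions Base.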
In order to make all of this injection painless, class declarations have an unusual structure: they consist of two declarations. One is the class constructor, which is not the constructor that constructs instances of the class, but rather the constructor that constructs the environment in which the class body runs.
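As a loose Python analogy (this is neither Newspeak nor Java-like syntax, and make_counter_class and the platform objects are names I invented), the "class constructor" behaves a bit like a factory function: it receives the platform, builds the environment, and only then produces a class whose body can see exactly what was pulled out of that platform.

```python
from fractions import Fraction
from types import SimpleNamespace


def make_counter_class(platform):
    """Plays the role of the class constructor: it builds the environment."""
    # Everything the class body is allowed to use is taken from the injected
    # platform here, once, before the class itself exists.
    Number = platform.Number

    class Counter:
        def __init__(self):
            # This is the ordinary instance constructor, a separate thing.
            self.value = Number(0)

        def increment(self):
            self.value = self.value + Number(1)
            return self.value

    return Counter


# The same declaration instantiated against two different platforms.
IntCounter = make_counter_class(SimpleNamespace(Number=int))
FractionCounter = make_counter_class(SimpleNamespace(Number=Fraction))

int_counter = IntCounter()
int_counter.increment()
second = int_counter.increment()
fraction_value = FractionCounter().increment()
```

The two resulting classes are distinct values that can be passed around like any other object, which is roughly what "parametric module declarations" and "first-class modules" amount to in this analogy.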
Note that the way a Newspeak programmer is actually going to see the class(es) is like this:
I can't even begin to do it justice, though. You'll have to play around with it yourself. Gilad Bracha has given a couple of talks about various aspects of the system, including modularity. He gave a really long (2hr) talk, the first hour of which is a thorough introduction to the language, including the modularity story. Chapter 2 of The Newspeak Programming Platform covers modularity. If you skim Newspeak on Squeak – A Guide for the Perplexed (aka Newspeak-101), you get a feel for the system. Newspeak by Example is a live document (i.e. it is running inside the Newspeak-on-ECMAScript port, every line of code is editable, every result is inspectable) demonstrating the basic syntax.
But really, you have to play around with it. It is just so different from all mainstream and even most non-mainstream languages that it is hard to explain; it has to be experienced.