I too have been looking at Julia ever since Doug Bates sent me a heads-up in January. But like @gsk3, I measure this on an "Rcpp scale" as I would like to pass rich R objects to Julia. And that does not seem to be supported at all right now.
Julia has a nice and simple C interface. So that gets us something like `.C()`. But as recently discussed on r-devel, you really do not want `.C()`; in most cases you rather want `.Call()` in order to pass actual SEXP variables representing real R objects. So right now I see little scope for Julia from R because of this limitation.
Maybe an indirect interface using TCP/IP to Rserve could be a first start before Julia matures a little and we get a proper C++ interface. Or we use something based on Rcpp to get from R to C++ before we enter an intermediate layer [which someone would have to write] from which we feed data to Julia, just like the actual R API only offers a C layer. I don't know.
At the end of the day, some patience may be needed. I started to look at R around 1996 or 1997 when Fritz Leisch made the first announcements on the comp.os.linux.announce newsgroup. And R had rather limited facilities then (but the full promise of the S language, of course, so we knew we had a winner). And a few years later I was ready to make it my primary modeling language. At that time CRAN still had way less than 100 packages...
Julia may well get there. But for now I suspect many of us will get work done in R, and have just a few curious glimpses at Julia.
Keno's answer is spot on, but maybe I can give a little more detail on what's going on and what we're planning to do about it.
Currently there is only an LLVM JIT mode:
- There's a very trivial interpreter for some simple top-level statements.
- All other code is jitted into machine code before execution. The code is aggressively specialized using the run-time types of the values that the code is being applied to, propagated through the program using dynamic type inference.
This is how Julia gets good performance even when code is written without type annotations: if you call `f(1)` you get code specialized for `Int64` (the type of `1` on 64-bit systems); if you call `f(1.0)` you get a newly jitted version that is specialized for `Float64` (the type of `1.0` on all systems). Since each compiled version of the function knows what types it will be getting, it can run at C-like speed. You can sabotage this by writing and using "type-unstable" functions whose return type depends on run-time data, rather than just types, but we've taken great care not to do that in designing the core language and standard library.
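As a rough illustration of the specialization described above, here is a minimal sketch. The body of `f` and the helper names `stable` and `unstable` are made up for this example; only the calls `f(1)` and `f(1.0)` come from the text.

```julia
# Hypothetical function written without any type annotations.
f(x) = 2x + 1

a = f(1)     # first call with an integer argument: compiled for Int
b = f(1.0)   # first call with a Float64 argument: a separate specialization

# Type-stable (hypothetical): the return type depends only on the argument type.
stable(x) = x < 0 ? zero(x) : x

# Type-unstable (hypothetical): for an integer argument the result is an Int
# when x is negative and a Float64 otherwise, so the return type depends on
# the run-time value rather than just the argument type.
unstable(x) = x < 0 ? 0 : float(x)

println(typeof(a), " ", typeof(b))
println(typeof(stable(-1)), " ", typeof(stable(1)))
println(typeof(unstable(-1)), " ", typeof(unstable(1)))
```

In a session, `@code_warntype unstable(1)` is one way to see how the compiler reports the uncertain return type of the second helper.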
Most of Julia is written in itself, then parsed, type-inferred and jitted, so bootstrapping the entire system from scratch takes some 15-20 seconds. To make it faster, we have a staged system where we parse, type-infer, and then cache a serialized version of the type-inferred AST in the file `sys.ji`. This file is then loaded and used to run the system when you run `julia`. No LLVM code or machine code is cached in `sys.ji`, however, so all the LLVM jitting still needs to be done every time `julia` starts up, which therefore takes about 2 seconds.
This 2-second startup delay is quite annoying and we have a plan for fixing it. The basic plan is to be able to compile whole Julia programs to binaries: either executables that can be run or `.so`/`.dylib` shared libraries that can be called from other programs as though they were simply shared C libraries. The startup time for a binary will be like any other C program, so the 2-second startup delay will vanish.
Addendum 1: Since November 2013, the development version of Julia no longer has a 2-second startup delay since it precompiles the standard library as binary code. The startup time is still 10x slower than Python and Ruby, so there's room for improvement, but it's pretty fast. The next step will be to allow precompilation of packages and scripts so that those can start up just as fast as Julia itself already does.
Addendum 2: Since June 2015, the development version of Julia precompiles many packages automatically, allowing them to load quickly. The next step is static compilation of entire Julia programs.
Best Answer
Symbols in Julia are the same as in Lisp, Scheme or Ruby. However, the answers to those related questions are not really satisfactory, in my opinion. If you read those answers, it seems that the reason a symbol is different than a string is that strings are mutable while symbols are immutable, and symbols are also "interned" – whatever that means. Strings do happen to be mutable in Ruby and Lisp, but they aren't in Julia, and that difference is actually a red herring. The fact that symbols are interned – i.e. hashed by the language implementation for fast equality comparisons – is also an irrelevant implementation detail. You could have an implementation that doesn't intern symbols and the language would be exactly the same.
So what is a symbol, really? The answer lies in something that Julia and Lisp have in common – the ability to represent the language's code as a data structure in the language itself. Some people call this "homoiconicity" (Wikipedia), but others don't seem to think that alone is sufficient for a language to be homoiconic. But the terminology doesn't really matter. The point is that when a language can represent its own code, it needs a way to represent things like assignments, function calls, things that can be written as literal values, etc. It also needs a way to represent its own variables. I.e., you need a way to represent – as data – the `foo` on the left hand side of this: `foo == "foo"`
Now we're getting to the heart of the matter: the difference between a symbol and a string is the difference between `foo` on the left hand side of that comparison and `"foo"` on the right hand side. On the left, `foo` is an identifier and it evaluates to the value bound to the variable `foo` in the current scope. On the right, `"foo"` is a string literal and it evaluates to the string value "foo". A symbol in both Lisp and Julia is how you represent a variable as data. A string just represents itself. You can see the difference by applying `eval` to them: what the symbol `:foo` evaluates to depends on what – if anything – the variable `foo` is bound to, whereas `"foo"` always just evaluates to "foo".
If you want to construct expressions in Julia that use variables, then you're using symbols (whether you know it or not). For example, dumping the expression object you get by quoting the code `foo = "bar"` shows, among other things, that there's a `:foo` symbol object inside of it. Another example is constructing an assignment expression with the symbol `:foo` stored in the variable `sym`. If you try to do this when `sym` is bound to the string `"foo"`, it won't work. It's pretty clear to see why this won't work – if you tried to assign `"foo" = "bar"` by hand, it also won't work.
This is the essence of a symbol: a symbol is used to represent a variable in metaprogramming. Once you have symbols as a data type, of course, it becomes tempting to use them for other things, like as hash keys. But that's an incidental, opportunistic usage of a data type that has another primary purpose.
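The distinction between evaluating a symbol and evaluating a string can be sketched as a short script. This is an illustrative reconstruction run at top level, not the original session output; the variable names `ex`, `str`, `threw` and the `from_*` names are chosen for this example.

```julia
foo = "hello"

from_symbol = eval(:foo)    # looks up whatever the variable foo is bound to
from_string = eval("foo")   # a string literal just evaluates to itself

# Quoting code yields an expression object that contains the symbol :foo.
ex = :(foo = "bar")
println(ex.head, " ", ex.args)

# Build the same assignment with the symbol stored in a variable, then run it.
sym = :foo
eval(:($sym = "bar"))       # afterwards the global foo is bound to "bar"

# Interpolating a string instead of a symbol gives an invalid assignment target.
str = "foo"
threw = try
    eval(:($str = "bar"))
    false
catch err
    true
end
println(foo, " ", threw)
```

The `try`/`catch` is there only so the script keeps running after the failed assignment.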
Note that I stopped talking about Ruby a while back. That's because Ruby isn't homoiconic: Ruby doesn't represent its expressions as Ruby objects. So Ruby's symbol type is kind of a vestigial organ – a leftover adaptation, inherited from Lisp, but no longer used for its original purpose. Ruby symbols have been co-opted for other purposes – as hash keys, to pull methods out of method tables – but symbols in Ruby are not used to represent variables.
As to why symbols are used in DataFrames rather than strings, it's because it's a common pattern in DataFrames to bind column values to variables inside of user-provided expressions. So it's natural for column names to be symbols, since symbols are exactly what you use to represent variables as data. Currently, you have to write `df[:foo]` to access the `foo` column, but in the future, you may be able to access it as `df.foo` instead. When that becomes possible, only columns whose names are valid identifiers will be accessible with this convenient syntax.
instead. When that becomes possible, only columns whose names are valid identifiers will be accessible with this convenient syntax.See also: