Andy Melnikov (nponeccop) wrote,

HNC - the beginning

I spent 8 years as a C++ developer maintaining a single software system: a distributed
MPI system working with large datasets (~100 GB to 1 TB). Needless to say, C++
was boring and competing languages were distracting:

* Programs in Perl and Haskell took much less time to write
* Perl is ugly, unsafe and slow; it is not even useful for IO-bound tasks. Python is beautiful
but no faster.
* Haskell runs much faster than Perl but still slower than C++ and is prone to memory leaks.
But even if I overcome leaks by mastering Haskell, it still doesn't support Windows x64.
* Java is still faster than Haskell but its memory safety and GC were not perceived as advantages,
and a huge runtime was seen as a clear deployment disadvantage.
* ATS looked like a way to go, but I feared its interaction with the existing huge C++ heap (6-16 GB
per node) and large densely packed datasets in general. Moreover, I couldn't compile ATS using
my development compiler - MSVC for Windows x64. And I was afraid of ABI issues between
* OCaml was the only compiler supporting Win x64 and MSVC at that time, but I was afraid of
its tagged integers and convoluted foreign code interface. And OCaml was perceived as ugly and

So I compiled a list of requirements for my next language:

* It should have a beautiful lexical syntax like that of Python, Haskell or JavaScript (basically '$;@%#' were forbidden characters)
* It should reduce parentheses, braces etc. as much as possible: `f x y` instead of `f(x, y)`
and indents instead of {} are virtues, S-expressions are a sin. Whitespace and Latin letters are good, punctuation and Greek letters are evil.
* It should be statically typed (I was tired of mistakenly assigning hashes to arrays of arrays) and
have full type inference (so it looks like a dynamic language)
* It should be generic: at least basic parametric polymorphism (ML-style without polymorphic
recursion; even value restriction is not a problem) and control abstractions (e.g. HOF).
* Pattern matching, algebraic data types, memory safety, referential transparency and
concurrency are not requirements.
* It should have a fast and textually concise interface with C code. The ability to interact directly
with C++ is a plus.
* It should work fine with 90% filled heaps and most data never dying. A good
generational GC could probably work.
* It should have an efficient data representation: if my C structures explode 2x in size
after porting, it's a disaster because RAM is already maxed out on nodes and we cannot
afford to double the node count, because of both software architecture and funding.
* It should be at least as fast as Haskell and it should remain fast on huge datasets (see
GC performance with filled heaps and inefficient data packing problems above).
* It should support Windows x64 and interface with MSVC x64.
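
As a rough illustration of the data representation requirement, the sketch below uses Python's `ctypes` to compare a densely packed C record with an assumed "boxed" layout in which every field becomes a pointer behind an object header. `PackedRecord`, `BoxedRecord`, `dataset_bytes` and the record count are hypothetical names and numbers chosen for the example; the boxed layout is a simplified model, not the actual representation of any particular language.

```python
import ctypes


# Hypothetical record as a C program would store it: two 4-byte fields, no padding.
class PackedRecord(ctypes.Structure):
    _fields_ = [("id", ctypes.c_uint32), ("weight", ctypes.c_float)]


# Assumed model of a uniformly boxed runtime: one header word plus one
# pointer per field. The pointed-to values live elsewhere and cost extra.
class BoxedRecord(ctypes.Structure):
    _fields_ = [
        ("header", ctypes.c_void_p),
        ("id", ctypes.c_void_p),
        ("weight", ctypes.c_void_p),
    ]


packed_size = ctypes.sizeof(PackedRecord)
boxed_size = ctypes.sizeof(BoxedRecord)


def dataset_bytes(record_size, record_count):
    """Bytes needed to store record_count records back to back."""
    return record_size * record_count


if __name__ == "__main__":
    count = 1_000_000_000  # illustrative record count, not a figure from the post
    print("packed:", packed_size, "bytes per record")
    print("boxed: ", boxed_size, "bytes per record, excluding the boxed values")
    print("growth factor:", boxed_size / packed_size)
    print("packed dataset:", dataset_bytes(packed_size, count) / 2**30, "GiB")
    print("boxed dataset: ", dataset_bytes(boxed_size, count) / 2**30, "GiB")
```

On a 64-bit platform the boxed model already exceeds the 2x limit before counting the heap cells that hold the values themselves, which is why dense packing matters when RAM is the bottleneck.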

F# and Nemerle could probably work. What else? Scala?
Tags: fp, hn0, programming
