Do We Create Type Systems In Dynamic Languages?

I was reading the article “Twitter on Scala” that appeared on Artima a few days ago. I was really interested to hear what they had to say, since I have been looking at Scala a bit over the past few months. I was also curious to hear what they had to say about Ruby and their reasons for wanting to move to Scala (at least for parts of Twitter).

I had expected to hear the usual complaints about Ruby, which generally revolve around performance and the runtime. But if that was their only problem, why didn’t they just move to running JRuby on the JVM? They would have gotten some of the perf they wanted, along with the benefit of a more mature garbage collector. Don’t get me wrong, those complaints were there, but there was another interesting comment made by Alex Payne.

As our system has grown, a lot of the logic in our Ruby system sort of replicates a type system, either in our unit tests or as validations on models. I think it may just be a property of large systems in dynamic languages, that eventually you end up rewriting your own type system, and you sort of do it badly. You’re checking for null values all over the place. There’s lots of calls to Ruby’s kind_of? method, which asks, “Is this a kind of User object? Because that’s what we’re expecting. If we don’t get that, this is going to explode.” It is a shame to have to write all that when there is a solution that has existed in the world of programming languages for decades now.

Before people start crying foul about the “kind_of?” checks in Ruby, yes, we all know that this is not the “Ruby Way”. We are all fully aware that if you are doing “kind_of?” checks all over the place then you are doing it wrong. But what interests me in this comment is the idea of creating a type system in unit tests. I had honestly never really thought about it in that way, and so when I read that comment I had one of those long “hmmmmmmmmmmmmmmm” moments.
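For readers less familiar with Ruby, the difference between an explicit “kind_of?” check and the duck-typed style can be sketched roughly as follows. This is only an illustration: the User and Bot classes and both greet methods are made up for this example and have nothing to do with Twitter's actual code.

```ruby
# Hypothetical classes for illustration only.
User = Struct.new(:name)
Bot  = Struct.new(:name)

# Explicit type check: rejects anything that is not a User.
def greet_checked(user)
  raise ArgumentError, "expected a User, got #{user.class}" unless user.kind_of?(User)
  "Hello, #{user.name}"
end

# Duck typing: accepts any object that responds to #name.
def greet_duck(user)
  "Hello, #{user.name}"
end

puts greet_checked(User.new("alice"))
puts greet_duck(Bot.new("hal"))

begin
  greet_checked(Bot.new("hal"))
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The checked version turns away a Bot even though a Bot would have worked fine, which is the usual argument against scattering these checks around.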

We all know that unit testing is absolutely important for any program, but testing takes on a new level of importance in dynamic languages because many times we are doing a lot of the checking that happens at compile time in statically typed languages. But are we really trying to recreate a type system?

I know that even in dynamic languages, when I write a method I often have an idea of what type will eventually be passed into the method, even if I am doing no type checking. Maybe I am still living in a static mindset, or maybe this is typical among developers in dynamic languages? I love the power that dynamic languages provide me, but I also realize that a good portion of my code would look almost identical in a statically typed language, especially one such as Scala or F# that has almost magical type inference.

But anyways, I want to hear some feedback, and here are a few questions to get you started.

1. When we write unit tests in dynamic languages, are we doing some of the work that statically typed language compilers do for us? I would argue yes. And maybe it is just because of my static background, but I would pass common types that I expect to be given to methods, in order to check correctness. In a statically typed language I would do the same thing, but I wouldn’t be checking for things like missing methods, misspelled methods/variables, etc.

2. If we are, then aren’t we doing it much less reliably? Again, I would say yes. With a statically typed language compiler, you really have very little chance of missing basic type errors. The bar is much higher in dynamic languages.

3. If this is the case, then would it be a better balance to have a static language that would let us fall into dynamic typing when we need it? This is a much tougher question. What advantages of dynamic typing would we lose when we have to think about using it? What would we lose in the interaction?

4. And finally, are we trying to recreate a type system? I’m not sure I’d go so far as to say that we are trying to recreate a type system, but I think that we are often thinking of concrete types as we develop our code. And when we test, I think that this is manifest. I want to hear your thoughts on this though.
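To make the first question a bit more concrete, here is a minimal Ruby sketch of a misspelled method call that nothing complains about until the code actually runs. The Person class and its typo are invented for this example, and the "test" is a hand-rolled check rather than a real test framework.

```ruby
# Hypothetical example: `full_name` contains a misspelled method call.
class Person
  def initialize(first, last)
    @first = first
    @last = last
  end

  def full_name
    [@first, @last].jion(" ") # typo: should be `join`
  end
end

# Defining the class raised nothing; the typo only surfaces when the method runs.
person = Person.new("Ada", "Lovelace")

# A hand-rolled "test": call the method and report what happens.
result =
  begin
    person.full_name
  rescue NoMethodError => e
    "test failed: #{e.class} (undefined method '#{e.name}')"
  end

puts result
```

A compiler for a statically typed language would typically reject the equivalent typo before the program ever ran. Here the only safety net is that some test exercises full_name.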

So leave a comment (or better yet, respond in your own post). What do you think?


22 comments

  1. I would say "Yes" to (1) and (2). Testing does require a knowledge of the type because quite often, the application logic depends on it.

    To some extent, dynamic languages have a scaling problem. The greater your code base, the more you need to worry about type safety and the more actual (and quite silly) problems you will encounter because of them.

    Having a static language that lets you fall into dynamic typing would be a good choice. It depends, I suppose, on the dynamic features in the language and how someone uses it. I mean, if someone is going to use the dynamic stuff by default and then re-introduces typing, I am not sure how it helps except waste a lot of time.

  2. @Krishna What I am wondering though, is whether or not this is the case across the board.

    Obviously you enjoy static typing, and believe that the costs of dynamic typing outweigh the benefits, but I’d love to hear from a developer working on a large scale dynamic language project. Maybe there just aren’t that many of them, and probably even fewer that read my blog. :-)

  3. Justin,

    Very nice observation.

    I know C is about as far away from {Ruby|Python|etc} as you can get, but one nice thing about Objective-C, in particular, is that you’ve got strong typing when you want it, but you can always use the (id) type, which essentially is just a prettier way to say it’s a pointer to "some object".

    My dream language would be Objective-C with a bit more Ruby-ish syntactic sugar and an interpreter so you could interactively program. Seems like the best of both worlds.

  4. Hi,

    Nowadays I work with Ruby most of the time, and I don’t want to go back to the static world. The power and flexibility the dynamic languages give us is awesome, the code gets so much simpler and more concise, it’s just much easier to maintain it.

    I’d answer No to questions 1 and 4 and Yes to questions 2 and 3. Tests in dynamic languages should check the behavior of things, not their type. If you do check "kind_of?" every time it means you’re not working as you should and you’re doing things the wrong way. This is pointed out by Obie (http://blog.obiefernandez.com/content/2009/04/my-reasoned-response-about-scala-at-twitter.html) as well.

    Question 2, if you’re doing it, working with types in a dynamic world, I’m sure you’re doing a terrible job of type checking, for one simple reason: you’re human. Type checking is a repetitive, mechanical task and should be left to computers.

    Question 3, having an optionally typed language would be very very nice, we could have all the speed from static typing coupled with the flexibility of dynamic typing, just fantastic. Dave Thomas, the man behind the pickaxe book, mentioned it in his keynote at RubyConf 2008 (http://pragdave.blogs.pragprog.com/pragdave/2008/12/forking-rubymy-rubyconf-keynote-is-now-up.html) and Charles Nutter has something like that called Duby (http://blog.headius.com/2008/08/duby-update.html), and I could argue that JRuby and IronRuby bring this kind of power to the table.

    Question 4, as I said before, if you’re static typing your dynamic code, you’re doing things the wrong way, blaming the language is just lame.

    Saying that "dynamic languages have scalability problems" is blaming the wrong thing. Architecture plays a much larger role in scaling than a language, the whole "Twitter doesn’t scale" is just FUD, they started scaling not after using Scala, but after fixing their infrastructure and architecture. They would have the same problems if they were using any other language, no matter how fast it ran.

    At least part of the Ruby community is really disappointed with Alex Payne, not because he is saying bad things about Ruby, but because he is doing so for the wrong reasons and personal gain (his Scala book). That’s sad, but that’s life, Ruby won’t go anywhere just because people don’t use it properly.

    Cheers

  5. @Rafael Thank you for taking the time to post such a detailed response, it is good to hear from a person who is working in dynamic languages on a day to day basis.

    I mostly agree with your responses, but I’m curious though when you answered "no" to number 1 you talk about using "kind_of?", but what about non-explicit type checking? When you write a method, even if you are just passing messages to an object, don’t you have a specific type of object in mind when you write it? Aren’t there assumptions that are made when designing classes like this?

    Also, I was arguing too that simple syntax errors can be caught by unit tests in dynamic languages much more often than they are in statically typed languages. Would you argue this?

    I have no doubt that dynamic languages are extremely powerful, I am a huge fan of Ruby, I’m just trying to wrap my mind around developing a large codebase in a dynamic language, and what its implications would be.

  6. I think it’s just wrong to test for the type of the variable, unless it’s really needed. Such checking detracts from what you are trying to accomplish and from relying on the exception backtrace to give you a clue of what went wrong when an error occurred. Also relevant is that if you are trying to pass in a new object that should work despite not being a direct descendent of a certain object, by checking for a certain type you restrict the usefulness of the API. Even "mocking" a certain object can be more troublesome then.

    That said, whereas many Rubyists can make more use of respond_to?(), I can make a few uses of is_a?() every now and then. They usually make use of more idiomatic Ruby that way than me.

    On the Spec/Test side, I hope everyone is testing more functionality than types as I am sure the type checking is just redundant in Ruby. Speaking of redundancies, a famous motto is all you need to follow to keep it cool: "don’t repeat yourself".

  7. Okay guys, yes we all know that doing "kind_of?" checks in Ruby is not the way to go. I even said that in the post. I am getting at a more subtle distinction of, are we doing some of the static compiler work in a dynamic languages through our unit tests?

    Yes, explicit type checking in a dynamic language is bad. We all agree on that. :-)

    Either you use kind_of? or you take the risk of breaking with a NoMethodError exception, which will 1) leave the application in an inconsistent state and 2) probably crash and confuse the user.

    You have to use kind_of? in both your tests and your business logic if you want to write robust code in a dynamically typed language.

  9. Mh, where to begin?
    I am working on programs in python about every day (partly employed, partly hobby-programming student) and some of them have grown kind of large (last time I checked, the largest grew to 4k – 5k lines in python. That is kinda largish for python code, in fact).
    I have to answer your question with yes. Yes, I always have some kind of interface attached to a certain parameter whenever I look at it (at least in my mind). I pretty much know what methods are required on this parameter so I can use it properly. (Thought from today: "This input needs to have a next() method, and the mark/commit/rollback-complex. This element needs to have a parse-method with the usual parse-contract.").

    However, I am not sure if it is correct to say: languages with runtime-typing converge to languages with static typing eventually. I much rather think that there are certain universal principles if you build object oriented systems. In fact, I think you can go even further: there are certain universal principles if you build modularized systems (classes are just a way to modularize). For example, interfaces with contracts attached to them are a very, very universal concept. They work for classes in (Python|Java|C#|Ruby), they work for modules (functional languages, C), they work for components. They can be applied to pretty much everything. Static typesystems just try to ensure a certain part of this contract (mostly they guarantee the existence of a certain set of methods to call). Dynamic typesystems offer no such luxury. However, both typesystems still need a contract-checker. If ‘a’ and 5 goes in, will True come out?
    However, this leads us to a surprising conclusion: If every method is used somewhere in a contract in your interfaces (and if it is not, why do you keep it around? it does nothing!), then you need to write tests for all your methods anyway in order to check if they fulfill the interface contract! And given this, there will be a certain redundancy with static typesystems. If you call each and every method in your tests at some time, then you don’t need the compiler to look if all methods exist, because your test-suite checks if all relevant methods exist. (Of course, this sort of breaks down once you venture into the harsh land of untestability. However, usually it is possible to keep this land very small ;) ).

    Thus, I think we have a coincidence here. Yes, a good testsuite might end up looking like typechecking code, looking for methods, invoking methods, comparing results with some object and maybe even some classes, BUT this is not because I want a typesystem. It is a coincidence, because a static typesystem does a certain subset of the responsibility of a good testsuite, but no causality (Oh, static typesystems have this, lets have that, too).

    —–

    On the other hand, about scaling programs in a dynamic language. The major problem behind scaling programs in a dynamic language is freedom. A dynamic typesystem gives you a lot of freedom to just do whatever you need to do. Pass in a different parameter type? Well, do it. I don’t care!
    However, this freedom comes with the requirement of discipline. In python, for example, you can perform very load-time and runtime-manipulations (think metaclasses (things generating classes) and linearizing (lisp-like) multiple inheritance and stackframe manipulations). If you do this, you can create a maintenance nightmare in which a function like "def x(): y() + 3" calculates the prime number #1823 and prints it. Of course, this is certainly not a good thing.
    This requirement for discipline kind of implies that building good programs in a dynamic language is a bit harder than building it in a statically typed language (especially because you can write down some information about the types of your variables in the program and have the compiler take a look at them).

    So.. hope to help,
    Tetha

  10. most definitely yes. having maintained around 700K lines of tcl, I watched over the years as our code base gravitated to 10 or so major "types" with very strong expectations.

    the reason it happens is because developers need a way to make sense of that much code. the mind wants to find a "standard way" and types fit that need.

    i had always praised tcl’s absolute "do whatever you want its all a string" philosophy, but when i had the epiphany i was basically doing the work of a compiler, i definitely realized it was time to grow out of the scripting world. those curious should check tcl out; its procedural, but so insanely flexible such that you can easily make your own inheritance scheme.

    around that time the cto got the go ahead for a port to java, and i welcomed it and haven’t looked back since. these days its all java/c# (for extra learning and entertainment software) and javascript since js is totally unavoidable, but always fun :)

  11. What you are building are DSL’s.

    Much of the discussion I’ve seen here (except for a couple of posts) conveys a viewpoint that comes from developers that cut their teeth in statically typed languages. I cut my teeth on Smalltalk, so I see things a lot differently even though I’ve been developing with (mostly) statically typed languages for 12 years now (C++, Java, C#).

    It sounds like a lot of the things that are being checked in the unit tests are just unnecessary. As implied by Rafael et al., if you’re doing all these checks, you probably are not on the proper tack to reach the desired destination.

    The sense is that people don’t think they are getting as much "bang for the buck" with Ruby – just maybe they should consider that the opposite is true, and should stop trying to make up the perceived loss manually.

    "Static typing catches the errors static typing creates."
    –Thomas Gagné

  12. Tetha: nice post.

    In my experience, test suites in dynamically typed languages still fall short of what static types do because they usually don’t test what happens when your method receives an object that doesn’t respond to the method you expect.

    Very often, I find that people who say that kind_of? or respond_to? is a code smell simply ignore this problem, which means their code is not robust.

    And once you realize that and you start writing tests to verify cases that fail, you are in effect reinventing your own type system.


    Cedric

  13. @Ivan, @Todd, and @Cedric It is very interesting to hear this argument from both sides. Unfortunately I don’t think it is a question that we are going to resolve anytime soon. The argument however is extremely interesting. Thank you for your comments!

  14. Hi,

    Sorry I didn’t reply earlier, but I wasn’t notified of new comments or forgot to check the notifications check-box :)

    The "no" answer for the first question is because, if you’re using a dynamic language with the same mindset of a static typed one, you’re on the wrong boat. The key thing about the messages, as you commented, is not expecting a type, but a behavior. I don’t care what’s the object’s type, I just want it to respond to the message, but checking beforehand to see if the object responds to the message is not necessary, and I’ll explain why.

    When you call a method and pass an object, in a static language, if you try to pass an invalid class or interface you’ll get an error right? What do you do? You change the parameter for the right one. Simple as that, the need for the right object is dead at that moment. When you’re working with a dynamic language the problem is the same, if you pass an invalid object the code won’t work, and we hope you have enough tests to make sure this won’t happen. Just that.

    There are times when checking for types is necessary, but mostly when we’re dealing with metaprogramming. Business rules code probably won’t have much metaprogramming involved, especially when you have a very specific task to accomplish, like Twitter. So, why check types? The developer will always be responsible for calling valid code and passing valid classes, the compiler check just helps us do that more easily, in a way that’s less dependent on our (often not so good) memory.

    You’re right that it’s easy to get the simple syntax errors using tests, and a static compiler does a much better job at doing that, since type checking is mechanical. The good part of having unit tests and the likes is that we can get logic errors with much less effort. I don’t believe we disagree on that.

    A while ago I was talking to an ALT.NET colleague in Paris and he was kind of skeptical about dynamic languages, and brought the large code base argument to the table. My answer to that was that large codebases are hard to maintain because they get too complex, and not because of the typing system they use. I had nightmares trying to maintain .NET apps with IOC, 4 layers, etc, even though they were pretty small, for the plumbing we need to set up in the code is already a big source of complexity. Using dynamic languages and frameworks, like Ruby and Rails, we can write less code, make it more readable, and better tested (assuming you’re doing it the Ruby Way), making maintenance easier.

    That’s what I think, at least :)

    Cheers

    P.S.: I’ll mark the check-box this time :)

  15. @Rafael The functionality to send the comment notification e-mails has a bug in it with BlogEngine.net. I think that I have it fixed now. Let me know if you get the notification of this comment.

    Thank you for the well thought out comments. Unfortunately I don’t have any experience with dynamic languages in large code bases, and so I was more interested in the discussion than what I could add to it. I think the comments on this post have been particularly interesting.

    I can see where you are coming from with your comments about behavior, I think that from my perspective it is hard for me to divorce the behavior from the implementation. Of course that is also coming from a decade of static typing flowing through my veins!

  16. Actually, I think I have the post notifications fixed now. Sorry for the multiple comment notifications.

  17. The notifications are working now, thanks :)

    I really doubt that the guy who wrote the Pinderkent article above ever worked with dynamic languages himself. Automated testing is essential to dynamic languages, yes, but it should be essential to the static ones too. It’s a matter of good development practice, not of fixing the language’s shortcomings.

    Cheers

  18. Great observations, Justin. I’ve had some related thoughts stewing around for a while and this is a good impetus to get them down. Will try to get those down soon-ish.

    Incidentally, though, I notice a number of people wishing that they could have the best of both worlds: a powerful static typing mechanism to catch obvious type errors, with opt-in dynamic typing where it makes sense. The real world, after all, is messy, particularly when we interact with the web. Casting back and forth is awkward and error-prone, and reveals the stringent limits on your flexibility as the price you pay for using a static type system.

    Fortunately, there’s a bit of light at the end of the tunnel in the .NET space, for this is precisely what C# 4 is trying to offer — check out the new dynamic keyword that it introduces if you haven’t played around with the CTP already. (And, if I may insert a shameless plug for those in the C’ville/Richmond area, you can find out more about what’s coming up in C# 4 at a presentation I’m giving next Thursday — see http://chodotnet.com for more details.)

    On the JVM side of things, there’s Groovy, which is a similar best-of-both-worlds typing system: static typing in a dynamic language. (C# 4 has "more" static typing in this regard, since you can still get compile-time checks, whereas there’s not a comparable feature in Groovy.) However, I think people are still getting comfortable with figuring out when it is and isn’t appropriate to use static typing in a dynamic language, so I don’t think this feature’s being used to its fullest effect just yet.

  19. We are working on a decent sized Ruby on Rails project here. It’s 2 1/2 years old and has had 12 production releases. There are 357 .rb files under the app directory. There are only about 60 calls to is_a?/kind_of?/respond_to?, and if you look closely, some checks can be removed without sacrificing code quality. Based on that observation and my experience on this project, I think type checking is not one of the major pain points. IMO, this might be due to the fact this project is an application, not a framework or library, and it is the same group of people who write the models and controllers. If you are working on a framework, you probably need to do more type checking, though this is just my guess.

    Just to share what I feel are the major pain points on this project, and to show that type checking is not a big deal on a typical CRUD project like ours, here is the list:

    Our project started on rails 1.2.x (or maybe older version). Because it is not as mature and also we have some special requirements, we have to hack into rails and database drivers. It becomes very hard to keep up with latest version of rails.

    Deployment is RPM based. I wouldn’t say it is bad but it is not how a typical rails project is deployed.

    Database migration is done very differently and painfully. Refactoring database schema causes a lot of headache and is avoided when possible.

    Because a lot of different people/consultants have worked on this project, code and tests show some inconsistencies and it is not an easy task to change.

  20. "Do we create type systems in dynamic languages?"

    No, you emulate them. Type systems don’t let errors through.

  21. I’m someone working on a large JRuby project, thought I would throw my perspective into the mix.

    I *do* definitely miss a static type system, for a few reasons, not all of them necessarily the ones you’d expect.

    * When it comes to data, and attendant issues like serialization, validation, schemas, persistence mapping etc, and the toolchain surrounding these issues. I find having static metadata about the types of your objects helps a lot. Especially anything where you need to reason statically about the *structure* of the data that you’re shuffling around — and as you start to scale up, being able to reason statically about dataflow is useful.

    In ruby I find myself and many others having to reinvent ways of expressing structure of data statically. ORM libraries, validation libraries, libraries for binding to different serialization formats and APIs, each has their own informal ad-hoc mechanism for describing the structure of data to some extent.

    I find it interesting to note that having a separate compilation phase or static type-checking isn’t a necessity to do this stuff well. Having some kind of agreed-upon, sufficiently-powerful type *system*, is.

    * A typed language forces you to think about types. This makes me think harder about design issues in a large codebase (responsibilities of objects, interface boundaries, the way things compose) and to state this design more explicitly.

    * With big codebases you do need more discipline. And this isn’t just, as people sometimes characterise it, an issue of inexperienced programmers needing to be reined in. Unit tests work for this too of course, but for the kinds of discipline which type systems are good at enforcing (and where that discipline is useful; it isn’t always), I think they are better tools than unit tests.

    * Yeah, I might have to insert a few more runtime type checks than I otherwise would. But I haven’t found this to be as big an issue as you would have thought.

  22. In my opinion “duck typing” is the most natural type system for dynamic languages.

    Library calls and unit tests can be wrapped in try catch blocks with failures printing the details of the type issue. No need for isa’s.
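
The approach in the last comment might look something like the sketch below in Ruby, where the equivalent of try/catch is begin/rescue. The call_or_report helper is hypothetical and written only for this illustration; it assumes a Ruby version recent enough to provide NoMethodError#receiver (2.3 or later).

```ruby
# Hypothetical helper: send `message` to `obj`, and describe the mismatch if it fails.
def call_or_report(obj, message)
  obj.public_send(message)
rescue NoMethodError => e
  # e.receiver is the object that could not handle the message; e.name is the message.
  "#{e.receiver.class} does not respond to ##{e.name}"
end

puts call_or_report("duck", :upcase)
puts call_or_report(42, :upcase)
```

One caveat with this style: the rescue clause also catches a NoMethodError raised deeper inside the called method, so the report can point at an object other than the one that was passed in.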
