André Staltz

Guidelines for a new programming language

This is a transcript of a talk I gave at CycleConf 2017 about a potential future for Cycle.js where we would explore building a functional and visual programming language for dataflow. It is part 2 of the talk “Past, Present and Future of Cycle.js”.

INTRODUCTION

The previous talk was basically “the future of Cycle.js”: different types of things we could do. One idea was that we could build a language. Why? Because when Nick was talking, for instance, about the Bonzai editor, he also had diagrams in ASCII. That’s basically a language already, to some extent.

And then we just started thinking that JavaScript might not be the best host language for this, because sometimes people do things like “map the stream, put a console.log”. If you do that with arrays, it works, because that runs immediately, but with streams it doesn’t, because nothing runs until something listens to the stream. People run into these problems, and all of that is because we don’t have good enough support in the language. (These are my slides, I’m sorry. The slides are my notes, so you can follow everything.)
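To make the array-versus-stream difference concrete, here is a minimal sketch in TypeScript. The `LazyStream` class is a hypothetical stand-in written only for this illustration; it is not the xstream or RxJS API, and real stream libraries are asynchronous and far more capable.

```typescript
// Hypothetical minimal lazy stream, for illustration only (not a real library API).
type Listener<T> = (value: T) => void;

class LazyStream<T> {
  // The producer is only invoked when someone subscribes.
  private readonly producer: (emit: Listener<T>) => void;

  constructor(producer: (emit: Listener<T>) => void) {
    this.producer = producer;
  }

  static of<T>(...values: T[]): LazyStream<T> {
    return new LazyStream<T>((emit) => values.forEach((v) => emit(v)));
  }

  // map() only wraps the producer; it does not call `fn` yet.
  map<U>(fn: (value: T) => U): LazyStream<U> {
    return new LazyStream<U>((emit) => this.producer((v) => emit(fn(v))));
  }

  subscribe(listener: Listener<T>): void {
    this.producer(listener);
  }
}

const log: string[] = [];

// Array: the callback runs right away, once per element.
[1, 2].map((x) => {
  log.push(`array saw ${x}`);
  return x * 10;
});
const afterArrayMap = log.length; // 2 entries already

// Stream: map() only describes the transformation; nothing is logged yet.
const mapped = LazyStream.of(1, 2).map((x) => {
  log.push(`stream saw ${x}`);
  return x * 10;
});
const afterStreamMap = log.length; // still 2

// Only subscribing makes the values flow through the callback.
const received: number[] = [];
mapped.subscribe((x) => received.push(x));
const afterSubscribe = log.length; // now 4
```

The surprise for newcomers is the middle step: the logging callback inside `map` is silent until a listener is attached.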

When we went to TypeScript, things were a bit better (of course you can still use JavaScript), but it’s not perfect. Then we were thinking: what would be perfect? And I do have some ideas for building a programming language; I was very much inspired by Elm. What I thought first was: why do we need programming languages? Of course we want to get software done, but what is our goal, that’s the question, and how do we get to that goal?

Usually some languages have design principles or guiding principles. Python has PEP 20, the Zen of Python, and this is like the Ten Commandments of Python, such as “there should be only one obvious way of doing stuff”, etc. And that sort of guides them: if they’re going to make new features in Python, they ask “does that follow the Zen of Python?” JavaScript doesn’t have that, so you have a ton of inconsistency, committee design, and a bunch of different companies trying to push a bunch of stuff into JavaScript, and it’s only going to get bigger and more full of features. So it will get harder and harder to have a more focused experience when using JavaScript, because it will be super generic.

So I started asking myself: what is programming? What is code? What is our goal and how can we get there? Do we need a programming language to achieve our goal? Basically this talk is: if I were to make a language, it would be guided by some guiding principles. That’s the most important thing to have in the beginning, because once you start building everything, you want to check if that language is following these principles. I don’t know if I’m going to build it, it might be too much work, but I’m just putting out that these are the kind of principles that I would put in a language. Then, this will probably spark some discussions about the nature of some things in programming.

ASSUMPTIONS ABOUT CODE

These are things that I assume are true in programming, or of code. They may not be true, but I believe that they are probably true.

First of all, code is a description of a very complex system. We have our programs, they are the system, integrated with all kinds of stuff. And code is the description of that. So typically code is declarative by nature, because you are declaring the system, no matter if you are declaring that with imperative style or with other styles, the idea of any code is to declare your system. If you want to answer “where is the conference?”, you can say the address (that’s declarative) or you can say “go forward, then left, go right”, that type of stuff. It’s one assumption I had about what is code.

Then I also thought that some people really like code as art, and that’s okay. But I assume that society values code more for delivering features than for delivering art. I assume that people think of code in that sense. Of course, code can still be valuable as art, but I assume that society thinks that code is more valuable when it delivers features. So it’s basically features. Society doesn’t really understand all of the other issues; tell them “it’s written in nice code” and they are like “so what?”. They don’t tolerate slow performance and bugs, and that’s just something that they assume. They won’t ask you to build a feature and make it fast, they will just assume that it will be fast once they ask for that feature. So code is a necessary evil, because they just want features, they want a system that delivers those features. And it happens to be that the way that we have right now is to describe this complex system through text.

I assume that code is more often read than written. If you take the whole programming activity and you measure time, it’s probably the case that you’re reading code more than writing it.

And then one that I discovered was the idea of valid and invalid states in programming. You cannot be vague when programming, there’s always the idea of “this is invalid”, “that is invalid”. So you always have these very strict rules in computers, and there isn’t that much way of being vague. This is actually something that we struggle with because of dynamic programming languages. In no programming language can you say “give me anything, and I’ll do anything”. Even in JavaScript (which is highly dynamic) you can’t usually give anything to any function. If you do, you’re going to get out invalid stuff like NaN, undefined, or whatnot. What strongly typed languages do is that they bring these restrictions forward, they make it very obvious that there are these invalid states. Dynamic languages, what they try to do, is hide them and pretend that you don’t have those restrictions. Now, this is a good thing because we, as humans, don’t like to work with very strict people, like very pedantic people. If I tell you:

So I understand why people want dynamic languages, it is very understandable. But we have this world of computers which are super strict, and we have the world of people, and how to make that bridge is hard. That’s why people prefer dynamic languages, because they get the feeling that they don’t need to be working with a pedantic system.

I really want to show this comic, Java versus JavaScript, this is the best:

http://i.imgur.com/76Wtthy.jpg

You may believe that you’re talking to someone that just understands you, but in fact, they are so weird. What dynamic programming does is that it delays these invalid states. You cannot hide the invalid states in programming, so what you’re doing with dynamic programming languages is that you’re just being vague when describing the system, but those invalid states will still be there, and you’re not handling them. What will happen is that those invalid states will show up in the latest stages, which are with the user. Using your program, it crashes. A bug report from a user is a compile error that you didn’t get while writing code. So it is kind of like a lie to believe in dynamic programming languages, because you will get this bug in the latest stages, and that’s embarrassing and also bad for business. So what dynamic programming gives you is short term benefits of not having to deal with these nitpicking cases, but in the end, the user will tell you these nitpicking things. I’m not talking so much about dynamic versus typed and saying that “typed is the future”, what I’m saying is we should bring these invalid states closer to the point where you’re writing the program. It could be through types, it could be through anything else. The point is that we are not delaying this to the latest stage, which is embarrassing.
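As a rough illustration of bringing invalid states closer to the point where you write the program, here is a small TypeScript sketch. The function names (`totalPriceLoose`, `parsePrice`, `totalPrice`) and the price scenario are made up for this example; types are only one of the possible ways to do this, as said above.

```typescript
// Vague version: accepts anything, so the invalid state only shows up at runtime,
// possibly in front of the user.
function totalPriceLoose(price: any, quantity: any): number {
  return price * quantity;
}

const surprise = totalPriceLoose("ten", 3); // NaN, and nothing warned us

// Stricter version: the invalid state is part of the type,
// so the code that uses a price has to deal with it while being written.
type ParsedPrice =
  | { kind: "valid"; value: number }
  | { kind: "invalid"; reason: string };

function parsePrice(input: string): ParsedPrice {
  const value = Number(input);
  if (input.trim() === "" || Number.isNaN(value)) {
    return { kind: "invalid", reason: `"${input}" is not a number` };
  }
  return { kind: "valid", value };
}

function totalPrice(price: ParsedPrice, quantity: number): string {
  // The compiler will not let us read `price.value` before checking `kind`.
  switch (price.kind) {
    case "valid":
      return `total: ${price.value * quantity}`;
    case "invalid":
      return `cannot compute total: ${price.reason}`;
  }
}

const okMessage = totalPrice(parsePrice("12.5"), 2);
const badMessage = totalPrice(parsePrice("ten"), 2);
```

In the loose version the `NaN` travels silently through the program; in the stricter version the “not a number” case is visible in the type and has to be handled before the code compiles.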

(And here is a big parenthesis. Since we like to be vague as human beings and Artificial Intelligence is also catching up, here is a multimillion-dollar opportunity: to have a programming language where you just tell a statement, like “put stuff on the screen”, and the AI is going to figure out what does that mean. And it’s going to have no syntax, because you have no strict way of explaining the system, you just explain it however you want, and the AI will figure out. So it could turn out that you just write whatever things that you want as if you would be a manager writing an email to a programmer, and the AI would just figure it out. Maybe it would tell you “can you give me a little bit more details, because I am struggling here”, that type of stuff. This could probably remove all our jobs in the future, but maybe not so soon. Maybe the business people wouldn’t use this, but probably they would pass this to some designer, which is a hybrid between a designer and a developer, and they would describe the system. But let’s go back to languages.)

Let’s talk a little bit about boilerplate and implicit stuff. In Cycle.js we say “this is an explicit framework”, but what does that even mean? It’s kind of hard. It’s all about assumptions. So for instance, Cycle.js is not entirely explicit because we don’t show the assembler code that is running for everything. So we assume some stuff to be rather obvious, and then we build on that. But there is a spectrum. The more you assume that the programmer reading code knows, the closer you are getting to “magic”. It’s those cases where you don’t need to import anything, you don’t need to do anything else than put some code somewhere, and everything magically works. That is basically assuming that you know that there is this system in place.

Experienced programmers usually prefer a bit more magical stuff, because they already have all of those assumptions embedded in them. For instance, if you’re tired of importing all of those packages, you could just not import, and just write your code. You hide assumptions, because you know that you would import them anyway, so you don’t need to type them. You just made it more magical. So then you can focus on signal instead of noise. You can focus on those things that deliver features, and type less code. Experienced programmers like to type less, because it’s so tedious to write stuff. But magic is anti-beginner, because it is assuming too much from the reader. If you’re a beginner, you don’t know that much, you don’t have those assumptions built into you. And then you might read code and be very confused. It might look like a very complex system when in fact it could have looked simpler by just showing those assumptions. So it’s very hard to go fully explicit, because then it means we’re going to show Assembly code. But what we can do is think “what does the average beginner look like, and what are the things that they know about programming”, and let’s use those things as assumptions.

ASSUMPTIONS ABOUT PROGRAMMING

About programming, I started noticing that programming to generate code is an activity that we do that has two modes, basically. In one mode, you are a student (reading). And in the other mode, you are a puzzle solver (writing).

In the student mode, you are putting the system into your head, trying to understand that system. This happens a lot when you are reading code, trying to learn the language, trying to learn the libraries. And the student mode happens always, not just for learning libraries: when you join a new project, you have to learn how that new project works. When you look at a pull request, you have to learn how that pull request affects your system. So it’s always about learning. It’s a mode that is always there. It’s about reading, and letting that stuff sink in. Even if you have the same libraries and same programming languages forever, the system (the runtime) will still need to be learned.

The other mode, we might think, is building stuff, but it’s mostly about solving puzzles that kind of look like building. The best analogy that I found was Tetris, because in Tetris you have some restrictions (e.g. walls), you have some possible choices with those different types of blocks that come in, and you have some freedom to choose where to put those blocks, but once you put those blocks in a specific way, you may trigger something funny to happen, like rows being removed. But in programming we don’t have the time restriction of Tetris. That would be quite horrible, imagine, “you have 2 seconds to write this!” But without the time limit, programming is basically like this, where you’re solving that kind of stuff.

You have flow when you understand the system (that’s why you need the student part) and when you are putting those pieces together and being productive.

One thing that I noticed is that the right amount of limitations creates flow and focus. For instance, this is what we do in xstream: RxJS has like 150+ operators, and that may bring analysis paralysis where you are like “okay I need to build this stuff, let’s choose an operator”, and then you spend a lot of time choosing. In xstream, what we did is that we have a small number of operators, which focuses your attention on these and makes it easier to choose. If I ask you “do you want an apple or an orange?”, you might quite quickly choose something. But if I ask you “which of these 120 fruits do you want?”, you might choose something you are familiar with, without accurately analysing all of those fruits. But also, if you have too much limitation, then you create frustration, where you want to get stuff done but you are limited.

I noticed that there needs to be a balance there, and it’s about allowing creativity as much as possible, but creating focus. We can do that by basically adding cost, and you are going to default to what is cheaper. An analogy would be, if I were to design a place for you to live, where there would only be healthy food grocery stores around you in 100 meters or less, and the closest fast food or unhealthy food place would be 5 kilometers away, and you have to go by bicycle, then you would end up defaulting to healthy food. But if you really want unhealthy food, you can go and get it. So there is cost there, that tends to lead you to the cheaper stuff. With xstream, when you want an operator, you can just type dot, and it shows an autocomplete panel of the operators, which costs almost nothing because it’s very quick, or you can import an extra operator, and compose() it. There is a bit more cost there, so you end up defaulting to the cheaper one. That helps, because we create focus, but don’t make it strict. We don’t make it as “besides these, you cannot use anything else, sorry”. We need to allow infinite possibilities, but not infinite paralysis when you’re analysing.
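The “cheap dot, slightly costlier compose()” idea can be sketched like this in TypeScript. `MiniStream` is a hypothetical, synchronous, array-backed stand-in written for this illustration, and the `dropRepeats` helper is reimplemented here by hand; neither is the actual xstream code.

```typescript
// Hypothetical mini stream, for illustration only (not xstream itself).
type Operator<T, U> = (input: MiniStream<T>) => MiniStream<U>;

class MiniStream<T> {
  private readonly values: T[];

  constructor(values: T[]) {
    this.values = values;
  }

  // Cheap path: a handful of built-in operators, reachable by typing a dot.
  map<U>(fn: (value: T) => U): MiniStream<U> {
    return new MiniStream(this.values.map(fn));
  }

  filter(predicate: (value: T) => boolean): MiniStream<T> {
    return new MiniStream(this.values.filter(predicate));
  }

  // The plug for everything else: any function from stream to stream.
  compose<U>(operator: Operator<T, U>): MiniStream<U> {
    return operator(this);
  }

  toArray(): T[] {
    return this.values.slice();
  }
}

// Extra operator: imagine it lives in a separate module and needs an import,
// which is the small extra cost that nudges you towards the built-ins.
function dropRepeats<T>(): Operator<T, T> {
  return (input) => {
    const kept: T[] = [];
    for (const item of input.toArray()) {
      if (kept.length === 0 || kept[kept.length - 1] !== item) {
        kept.push(item);
      }
    }
    return new MiniStream(kept);
  };
}

const result = new MiniStream([1, 1, 2, 2, 3])
  .map((x) => x * 10)              // built-in, cheap
  .filter((x) => x > 10)           // built-in, cheap
  .compose(dropRepeats<number>())  // extra, a bit more effort
  .toArray();
// result is [20, 30]
```

The built-in set stays small and focused, while `compose` keeps the door open for anything else without making the core API bigger.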

ASSUMPTIONS ABOUT TEAMWORK

A little on teamwork: code style bike-shedding brings little or no benefit to society. So Prettier for JavaScript is great. And this is something that I actually asked on Twitter: what do you think is good teamwork, good pair programming and good code reviews? And this is a little bit of what people said: “Good code reviews focus on the macroscopic ideas in the code change, not the microscopic”.

DESIGN PRINCIPLES

So what kind of design principles did I come up with? It’s pretty simple: if programming is about studying systems and solving puzzles, let’s make it easy to study code, and let’s make it easy to solve puzzles.

On writing (“solving puzzles”), I already gave some hints like having the right amount of limitations, and keeping yourself in the flow. I can show an example of the OP-1 device by Teenage Engineering, it is made in Stockholm, and it’s a music synthesizer.

(Image: the OP-1 synthesizer by Teenage Engineering)

Usually synthesizers are huge machines with thousands of buttons, but this one has just this many. You can get super in the flow with this. You start fiddling with it, and you intuitively start learning “this does that”, “that does that”, and in five minutes you’re creating music. I’ve used other music tools and software, and when you decide “I’m going to make something”, you spend two hours just configuring how the drum sounds. And then you have lost the idea. Also, with the OP-1 you have a connection to the radio, so you can sample the radio, or you can put a microphone here. That is the plug for infinite creativity: you can always plug something in there and add it to this. So this is a big inspiration for how you would solve puzzles: you have very limited options, but with a plug for infinite creativity.

How can we make it easy to study code? I learned that readable code means quickly learnable code. It’s not so much about a variable name being “readable”; the only thing that matters is whether you can spend five seconds with something and learn what it is. There are multiple ways we could achieve that, and naming is just one of those ways. For instance, some people comment everything in their code, so most of the code is just comments. Some people find that very unreadable, because you don’t see the actual code. But what if you could take all of those comments and hide them somewhere, so that if you hover on that spot, you would see all the comments? Let’s say this function would have some comments, and the programmer would have written “I know that we shouldn’t have this line here, etc”. When you open that file you would only see code, but if you want to learn about it, you could see those comments quite easily. We can think about how to show that in different ways.

Different people have different ways of learning. Some people prefer to fiddle with it in runtime, like Gleb Bahmutov demoed in a very nice way, some people prefer to have comments. There are different ways of learning. I prefer to just learn with diagrams, static diagrams, that’s why I’m focused on diagrams, but we could have all of these together. The goal is to support people to study code that they’re seeing.

SYNTAX

Of course, when we talk about language, people just imagine what is the syntax going to be like? Is it going to be like Haskell, like C?

I noticed that Haskell is great, I learned some Haskell, but it can be very scary, and people are very sensitive to syntax. I also noticed that it’s not that relevant. Of course it can help to make things shorter, but how much shorter? Is it 2 lines of code shorter, or 57 lines shorter? There is a big difference there. If it’s two lines, maybe that’s not a problem. For instance a lot of people find the Reason syntax more approachable than OCaml, and that’s why people are raving about Reason.

Also, who said that syntax needs to be text? Just like we have languages for humans, which are voice or text, we also have sign language, and it’s also a legitimate language. That’s why, when Nick mentioned Bonzai, that’s a visual language that can also be textual.

There are actually some projects that are very similar to what I imagine I would build. One of them is called Luna, and it’s a visual and textual functional programming language. Trust me, this is the Cycle.js idea. You have the dataflow graph and things are connected. This is Cycle.js, kind of. But it also has a dual representation, either graphs or code, so you can see whichever one you want. This allows better readability, you can go here or there, whichever you prefer, or both. The only thing we know about Luna is the front page. It’s a private thing, we can’t fiddle with it. It would be really great if we could test it out and see if it can be something. They say that they are building interop for Haskell, Python, and they are thinking of C++ and JavaScript, so who knows.

Another language that inspired me was this one called Koka. It’s made by the person who wrote Parsec for Haskell (somehow you need to install both Haskell and Node.js). What’s really interesting about this language is that it kind of looks like typical C-style code or Python, and you might think it’s just imperative code. What’s cool is that this is imperative code that sort of “transpiles” to fundamentally functional programming code. There are some cool ideas here. One of them, which I really liked and would probably take as inspiration, is that when you say s.encode(3), it doesn’t actually get the encode “method” on s, but it calls encode(s, 3). So you could use functions everywhere, but with a convenient and familiar dot syntax. So you could do s.encode(3).count.println and it actually does println(count(encode(s, 3))). You can use this very familiar syntax that Java programmers, C programmers, JavaScript programmers are comfortable with, but it’s functional piping. So I would take that into consideration, basically, trying to make something approachable in C-style, but fundamentally the concepts are functional and dataflow.
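TypeScript has no such dot desugaring, but the idea can be approximated to show what the chain means. In this sketch, `encode`, `count` and `println` borrow their names from the example above, while their bodies (a Caesar shift, a length, a recording print) and the `Chain` helper are assumptions made up for illustration.

```typescript
// Plain functions that take their "subject" as the first argument.
// Bodies are invented for this sketch; `encode` assumes a non-negative shift.
function encode(s: string, shift: number): string {
  return s.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(((ch.charCodeAt(0) - 97 + shift) % 26) + 97)
  );
}

function count(s: string): number {
  return s.length;
}

const printed: string[] = [];
function println(value: unknown): void {
  printed.push(String(value)); // recorded so the result can be inspected
  console.log(value);
}

// What the dot chain means once it is desugared into nested calls:
println(count(encode("koka", 3)));

// Hypothetical helper that reads left to right, like the dot syntax does:
// chain.to(f, a, b) calls f(currentValue, a, b).
class Chain<T> {
  private readonly value: T;

  constructor(value: T) {
    this.value = value;
  }

  to<A extends unknown[], R>(
    fn: (subject: T, ...args: A) => R,
    ...args: A
  ): Chain<R> {
    return new Chain(fn(this.value, ...args));
  }
}

// Same computation, written as a left-to-right pipeline.
new Chain("koka").to(encode, 3).to(count).to(println);
```

Both forms call exactly the same free functions; only the reading direction changes, which is the point of the dot syntax.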

Another thing: when you input Chinese text, you are typing in one character set and you are getting out something in another character set. What if we had something like this? You would type text, and at the same time you would get visual diagrams being built on the side. So exploring all these kinds of ideas is something that I have in mind for the syntax.

What I think is that programming is not so much about putting words together, but about plugging ideas together. When you say something in English, of course there is the syntax and the grammar, but there is also the semantics and this subject, that verb, those are what are more important, and syntax can then be negotiated later.

NEXT

Will this happen? I don’t know, because languages are projects for 10 years or decades. You can’t just make a small language and use it, there are all kinds of questions about package managers and linting tools and testing tools. It’s just so deep, and I’m not sure if I have the time to build that. Which basically means “do you want to get married to a programming language?”. But also, transpiling languages can be small, or can be huge. PureScript and TypeScript are huge, in commitment and how much you need to do there, but some languages are small. It could be that we just have a nice syntax for producing a Cycle.js app. Will it happen? I don’t know. Thank you for listening.


Copyright (C) 2017 Andre 'Staltz' Medeiros, licensed under Creative Commons BY-NC 4.0, translations to other languages allowed.