After giving some talks about Clean Code, I have decided to summarize the most important points in an article. Because there are already a lot of posts and information on the net about Clean Code, I think that one more article that simply explains some of the principles would not add much value.
In this article I will try to give you a practical approach to Clean Code. I will not go deep into the theory; I want to show how I write Clean Code.
What is Clean Code and why should we care?
Origin and a definition
It’s mandatory to talk about the book of the same name written by Robert C. Martin in 2008. But other books and experienced developers were discussing similar concepts before the book was first released.
I have built a kind of definition of Clean Code by combining the opinions of several authors and sources. What I get is that Clean Code has these features:
- It is important, at least as important as other concerns like performance, covering the functionality, avoiding bugs, …
- It is easy for any developer to read.
- It is easy for any developer to modify.
- It was written by someone who cares about it.
- It does what is expected. The code does not fool you, no surprises.
Why you should write clean code
I really believe that writing Clean Code is important because it’s the first step towards the main goal of any architecture: minimizing the human effort needed to create and maintain the required system.
When we are coding, we usually spend more time (much more time) reading code than writing it. We read legacy code, library code, our teammates’ code, code we wrote several months ago (and no longer remember), code written by someone who left the company, code on Stack Overflow… Robert Martin puts some numbers on this:
“Indeed, the ratio of time spent reading versus writing is well over 10 to 1” — Robert C. Martin, Clean Code
Taking this into consideration, isn’t it worth a little extra writing effort? You will get back several times the extra time you spend cleaning your code. Think about it the next time you are going to commit a piece of code.
A last thought on why we should care about our code: we, the developers, do not write code solely to be interpreted by machines. We write code to be read by humans. Let’s think of ourselves as authors, in the same way that a journalist writes for a newspaper or a writer creates a novel.
Some principles
There are many principles and ideas about what is clean and what is not. You can find them in the book and also on the net. But because I need some basic principles to have something to play with, I am going to introduce a few of them. If you already know them, you can skip to the next section. If not, I think they will give you an idea of the style of the rest of the theory and, perhaps, the motivation to learn more.
So, let’s see some principles with code.
Naming
We choose names everywhere when we are coding: variables, functions, classes, packages, files… Taking naming seriously is a first step towards Clean Code.
Some tips to get clean names:
- Use intention-revealing names
- Choose pronounceable names
- Use searchable names
- Avoid prefixes, suffixes and abbreviations
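To make these tips concrete, here is a minimal sketch in Java. Every name in it (`NamingExample`, `d`, `w`, `crtDt`, `daysSinceCreation`, `DAYS_PER_WEEK`, …) is hypothetical and invented only for illustration; it is one possible way to apply the tips, not the only one.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical example: all names are invented for illustration.
class NamingExample {

    // Before: abbreviated, unpronounceable names that reveal no intention,
    // and a bare 7 that is almost impossible to search for.
    static long d(LocalDate crtDt, LocalDate t) {
        return ChronoUnit.DAYS.between(crtDt, t);
    }

    static long w(LocalDate crtDt, LocalDate t) {
        return d(crtDt, t) / 7;
    }

    // After: a searchable constant replaces the magic number.
    static final int DAYS_PER_WEEK = 7;

    // After: intention-revealing, pronounceable names without abbreviations.
    static long daysSinceCreation(LocalDate creationDate, LocalDate today) {
        return ChronoUnit.DAYS.between(creationDate, today);
    }

    static long fullWeeksSinceCreation(LocalDate creationDate, LocalDate today) {
        return daysSinceCreation(creationDate, today) / DAYS_PER_WEEK;
    }
}
```

Both versions compute the same values. The difference is that a call like `fullWeeksSinceCreation(creationDate, today)` can be understood without opening the implementation.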
There are many other tips about choosing good names. Using intention-revealing names is probably the most important one, but what really matters is taking naming seriously.
Take your time to choose good names. And don’t be afraid to change an existing name if you think that the code is going to be more readable after the change.
Functions
Among all the ideas about clean functions, I am going to highlight three for this post.
The first rule is basic and easy to remember: functions should do one thing, they should do it well and they should do it only. Not too much to explain here: avoid side effects, and split your functions if you notice that they are doing several things at the same time.
Functions should be small. OK, but how short should they be? How do we measure it? Let’s force our functions to have no more than 2 levels of indentation. If this is hard for you, you can start by setting a higher limit (3 levels, for example), but please put a limit on the levels of indentation you allow.
Regarding the arguments… The fewer arguments a function has, the cleaner it is. Why? Arguments need a lot of context knowledge. In each call, a reader must have context to understand each argument: more arguments, more context needed. Arguments are also hard from a testing point of view: more arguments, more test cases to ensure that all the combinations of arguments work properly.
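As a rough illustration of the indentation limit, here is a sketch in Java. The task (counting long words in some lines of text) and all the names are assumptions made up for this example. The first version needs three levels of indentation; the second one splits the work so that no function goes beyond two levels and each one does a single thing with a single argument.

```java
import java.util.List;

// Hypothetical example: the task and all names are invented for illustration.
class SmallFunctionsExample {

    // Before: three levels of indentation (for, for, if) in one function.
    static int countLongWordsNested(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.length() > 5) {
                    count++;
                }
            }
        }
        return count;
    }

    static final int LONG_WORD_MIN_LENGTH = 6;

    // After: iterates over the lines and delegates the rest.
    static int countLongWords(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            count += countLongWordsIn(line);
        }
        return count;
    }

    // Counts the long words of a single line: two levels of indentation.
    static int countLongWordsIn(String line) {
        int count = 0;
        for (String word : line.split(" ")) {
            if (isLong(word)) {
                count++;
            }
        }
        return count;
    }

    // Decides what "long" means in exactly one place.
    static boolean isLong(String word) {
        return word.length() >= LONG_WORD_MIN_LENGTH;
    }
}
```

A side benefit of the split is that each small function can be tested on its own with very few cases.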
Comments
When I learned to code in the 90s, my teachers used to ask me to write comments everywhere. It was typical to hear them say things like “If you don’t comment your code, I’m not going to correct your exam…”. The goal of those comments was to make our code easier to read. This is a goal that we also have when writing Clean Code, but maybe comments are not the best way to achieve it.
“Comments compensate for our failure to express ourselves in code. Comments are always failures” — Robert C. Martin, Clean Code
I agree with Martin on this point. He also says that comments lie, and I am sure that we have all found old comments saying something outdated. The code is maintained, but nothing forces the comments to be maintained as well. Is there anything worse than a fake comment?
The truth is only in the code. When you think you need to write a comment, always consider whether there is a better way to express the same thing in the code.
The most important idea is to avoid comments that explain the code. For example:
- Avoid comments to explain a variable. Instead, choose a good name for this variable and you won’t need a comment.
- Avoid comments to explain a function. Instead, force your function to do only one thing, give it few arguments, choose a good name for it and its arguments, and you won’t need a comment.
Let’s see a practical case:
- We have something complex. We feel that it is going to be hard to understand in the future:
- We can add a comment to make it easier to understand:
- Let’s try another option, let’s extract the complex code to a method with a cool name:
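The three steps above can be sketched in Java as follows. This is only an approximation of the idea: the surrounding method (`daysInFebruary…`) and the class name are assumptions added so that the snippet compiles on its own.

```java
// Hypothetical example: the daysInFebruary* methods exist only to host the "if".
class LeapYearExample {

    // Step 1: something complex that will be hard to understand in the future.
    static int daysInFebruaryInline(int year) {
        if ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0) {
            return 29;
        }
        return 28;
    }

    // Step 2: the same condition, explained with a comment.
    static int daysInFebruaryCommented(int year) {
        // Check whether the year is a leap year
        if ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0) {
            return 29;
        }
        return 28;
    }

    // Step 3: the condition extracted to a method whose name replaces the comment.
    static int daysInFebruary(int year) {
        if (isLeapYear(year)) {
            return 29;
        }
        return 28;
    }

    static boolean isLeapYear(int year) {
        boolean divisibleBy4 = year % 4 == 0;
        boolean divisibleBy100 = year % 100 == 0;
        boolean divisibleBy400 = year % 400 == 0;
        return (divisibleBy4 && !divisibleBy100) || divisibleBy400;
    }
}
```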
Think about what a future reader will want to know when they find this if. They will be interested to know that this if is checking if the year is a leap year, but probably they will not be interested in how we are getting it. If they are curious, they can navigate to the implementation of this coolly named method. Unintentionally, by avoiding a comment we are separating different levels of abstraction in our code.
So, in general, avoid comments that explain code. And, because you will be using Git or another version control system, avoid commented-out code (delete it!), attributions and bylines, and that kind of stuff.
Comments are not forbidden. There are situations where comments make sense:
- Legal comments
- TODO comments
- Amplifications of the importance of something, or the reason behind a concrete decision in the code
- Mandated comments (JavaDocs, …) in public APIs (but avoid this in non-public code… don’t force a team to comment all the functions of all your classes!)
More principles
As I said before, I was going to cover just a few principles to give a basic understanding of the Clean Code theory. So these are only some of the most basic ideas about Clean Code. If you find them interesting and want to know more, look for more resources on the net or go straight to Martin’s book.
How to clean your code? Refactor is the key
Well, good naming, small functions, no comments to explain the code… got it. But how can I do it? How can I write code following these ideas?
Our job is hard in itself. Writing code and getting it to work can already be enough of a challenge, and that is without worrying about leaving it clean.
So the keyword here is Refactor. A good way to work is to write code without worrying too much about cleanliness and later, when the code does what we want, clean it up with a refactor.
Definition of Refactor
Code refactoring is the process of restructuring existing code without changing its external behavior. It means that the code before and after the refactor must do exactly the same thing.
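A tiny sketch of what “same external behavior” means, in Java. The scenario (adding VAT to a price) and all the names are hypothetical. The second method is the first one after two small refactors: renaming things and creating a constant for a hardcoded value.

```java
// Hypothetical example: the VAT scenario and all names are invented for illustration.
class RefactorExample {

    // Before the refactor: cryptic names and a hardcoded value.
    static double calc(double p) {
        return p + p * 0.21;
    }

    // After the refactor: a constant for the hardcoded value and better names.
    static final double VAT_RATE = 0.21;

    static double priceWithVat(double netPrice) {
        return netPrice + netPrice * VAT_RATE;
    }
}
```

For any input, `calc` and `priceWithVat` return the same result. If they did not, the change would not be a refactor.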
Things that are not refactors:
- Change an algorithm
- Replace one type of loop with another
- Upgrade the performance of a piece of code
Things that are refactors:
- Extract a piece of code to a Function
- Rename things
- Extract several functions to a new Class
- Create a constant to store a hardcoded value
- …
Safe Refactoring
Maybe you are thinking “I don’t want to break the code, it’s working fine!”. Yes, of course. It may have been quite difficult to get the code working, and we don’t want to break it when refactoring. This is the typical argument people give for not changing very bad code. But don’t worry, there is a safe way.
We can rely on two things to refactor without fear:
- Testing: we should have good automated tests for our code for many reasons. But it is obvious that they will help us refactor without breaking anything. After each refactoring you can check whether all the tests are still green. I am not going to write about testing in this post; maybe in another one in the future. If you don’t know about testing, you should. There is a lot of information on the net. If you don’t know where to start, ask me :).
- Refactoring tools: modern IDEs have tools that perform some of the most common refactoring actions automatically. If we use them, we reduce the chances of breaking something when making changes to the code. I am going to introduce them at the end of this post ;).
When should you Refactor your code?
All the time. I mean, you should be working in development cycles of:
- Write code
- Write tests
- Refactor
And not necessarily in this order. You can develop using TDD and, in this case, you will write your tests before the code. But anyway, you should refactor each time you have a piece of working code. In other words: you should refactor at the end of each cycle. And these cycles should be small.
If you work in small development iterations, it will be very easy to refactor, to ensure that everything is clean, that you are not breaking things, etc… Do you usually spend several days writing code and end up making a delivery with a lot of lines of code in several files? Maybe that is not the best habit.
Appendix: Refactoring tools
For this appendix I have chosen an IDE that I am comfortable with: JetBrains’ IntelliJ IDEA for Java. But you will be able to find this kind of tool in the IDE you use for your preferred language. If not, maybe you should try another IDE.
Rename
Probably the simplest refactoring tool is Rename. You have an entity with a name that you don’t like and you want to change it. Of course, you can edit it manually... but this won’t be trivial because the entity can be used in a lot of places.
For example: I want to change the name of the class Input. I want to call it WordFrequency.
Because it is a Class, there are a lot of potential places where I would have to change the Input identifier. Doing it manually, I would have to look for all of them. But we have the Rename refactoring tool:
This tool will rename the entity and take care of renaming everything else we need: other usages, the file name, tests… even variable names that could be related to our entity: