C preprocessor tricks

The C preprocessor is a heritage from an ancient age (the 1970s). Modern languages provide better ways to do most of the things the C preprocessor was (and is) used for. The D programming language, for example, has removed the need for a preprocessor with “normal” import statements, conditional compilation using static if statements, and so on. In C you’re stuck with the preprocessor, but it isn’t that bad and, as a matter of fact, it can do some rather neat things.

Remember that even though the C preprocessor has many valid uses, it can be abused in even more ways, leading to weird errors or code that is hard to understand. The preprocessor has its uses, but often you should simply avoid it.

#include can include any kind of text, not just source code

The #include statement includes any kind of text, not just source code. For instance, you can use #include to include values for an array, assuming they are formatted as valid C code within a text file, e.g. 1, 2, 3, 4, 5. If this was stored in a file called values.txt, we could declare an array with these values simply by typing in the following in our C program:

int arr[]={
#include "values.txt"
};

For larger sets of data, this could come in very handy.

Put more complex #define values inside parentheses

This is not okay:

#define TWO 1+1

First of all, you already have a constant for the value 2; it looks like this: 2. And if you did want a macro for it, you should define it simply as 2 rather than 1+1. But disregarding those flaws, there is yet another kind of problem. If we try to multiply our TWO by, say, 3, we get 4 as the answer, even though 2 times 3 is 6. Why? Because the preprocessor expands 3*TWO to 3*1+1, and multiplication has higher precedence than addition. This is solved simply by adding parentheses around the value in the definition: (1+1).

This is an example of those funny bugs careless use of the preprocessor can cause.
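
To make the difference concrete, here is a minimal sketch in C. The macro and function names (TWO_UNSAFE, TWO_SAFE, triple_unsafe, triple_safe) are made up for this illustration:

```c
/* Hypothetical macro names, chosen for this illustration. */
#define TWO_UNSAFE 1+1     /* expands textually to 1+1 */
#define TWO_SAFE   (1+1)   /* the parentheses keep the sum together */

/* 3 * TWO_UNSAFE becomes 3 * 1+1, which is evaluated as (3 * 1) + 1. */
int triple_unsafe(void)
{
    return 3 * TWO_UNSAFE;
}

/* 3 * TWO_SAFE becomes 3 * (1+1). */
int triple_safe(void)
{
    return 3 * TWO_SAFE;
}
```

Calling triple_unsafe() gives 4, while triple_safe() gives the 6 you would expect.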

Retrieving the name of a variable

By putting a hash (#) in front of a macro parameter, you can insert the text of the argument as a string literal into your code. The following is a macro that prints the name of a variable coupled with its value, which could be useful while debugging:

#define PRINT_VAL(val) printf("%s=%d\n", #val, val)

Note that the above example will only work with integers because of how printf() is called.
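
One way around the integer-only limitation is to have one macro per type. The following is only a sketch; PRINT_DOUBLE and FORMAT_INT are hypothetical names, and FORMAT_INT writes into a buffer instead of printing so that the result can be inspected:

```c
#include <stdio.h>

/* #expr turns the macro argument into a string literal. */
#define PRINT_DOUBLE(expr) printf("%s=%f\n", #expr, (double)(expr))

/* Hypothetical helper: formats "name=value" into buf instead of printing. */
#define FORMAT_INT(buf, size, expr) \
    snprintf((buf), (size), "%s=%d", #expr, (expr))
```

Note that the argument does not have to be a plain variable: FORMAT_INT(buf, sizeof buf, count + 1) stores the whole expression text, count + 1, in front of the equals sign.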

#error and #warning

You can use the #error and #warning directives to create your own compiler errors and warnings (#warning was long a compiler extension, supported by GCC among others, and was only standardized in C23). For instance, if you would like to raise an error if someone tries to compile your program on Windows, you could add this little snippet of code:

#ifdef _WIN32
#error "Disturbing, your choice of operating system is."
#endif

You should only do this if you have a valid reason (e.g. refuse to compile on Linux if your program uses DirectX, since Linux doesn’t support that), even though the example above could arguably be a valid reason too, depending on whom you ask.

#warning is used in the same way, but it only makes the compiler print a warning; unlike an error, a warning does not halt compilation. You could, for instance, print a warning whenever you compile with debug mode on, to avoid accidentally shipping the debugging version as production software.

Debugging preprocessor macros

Most compilers are able to do only the preprocessing step of the build, allowing you to see how preprocessor macros are expanded in your code, greatly easing the hunt for bugs related to the use of the preprocessor in the code. For GCC, the option to toggle this is -E.

Compiler specific preprocessor features with #pragma

The C standard includes the #pragma preprocessor statement for compilers to define their own preprocessor features. To see which pragma directives are provided by your compiler of choice, you should consult its documentation.

Most modern compilers support the #pragma once directive. It tells the compiler to include the file only once, removing the need for the (much more verbose) header guards commonly used. Still, traditional header guards have been used for so long that programmers probably won’t stop using them any time soon, even if this alternative would generally be superior (it is non-standard, though, and some legacy compilers may not support it).

__FILE__ and __LINE__

__FILE__ will insert the name of the current file, and __LINE__ will insert the line number of the current line of the source code. These can be very useful for generating debug information, especially if you are unable to use a debugger. Think kernel programming: technically, it is possible to use a debugger for OS kernels, but it isn’t as easy and practical as it is for user-space programming.
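
A common way to use the two together is a small logging macro. This is just a sketch; DEBUG_LOG is a hypothetical name, not something the standard library provides:

```c
#include <stdio.h>

/* Hypothetical debug macro: prefixes a message with its source location.
 * __FILE__ and __LINE__ are expanded where the macro is used, not where it
 * is defined, so every call reports its own position.
 * The macro evaluates to the return value of fprintf(). */
#define DEBUG_LOG(msg) \
    fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg))
```

A call such as DEBUG_LOG("entering init"); then prints the file name, the line number of that call and the message to standard error.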

__DATE__ and __TIME__

__DATE__ and __TIME__ can be used to insert the current date and time (at the time of preprocessing) into the code, respectively. You could for instance show this information in the version information for your program:

printf("%s v%s\nCompiled on %s at %s\n", PRG_NAME, PRG_VERSION, __DATE__, __TIME__);

Void pointers in C

Void pointers are pointers pointing to some data of no specific type.

A void pointer is defined like a pointer of any other type, except that void* is used for the type:

void *pt;

You can’t directly dereference a void pointer; you must cast it to a pointer with a specific type first, for instance, to a pointer of type int*:

*(int*)pt;

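Thus, to assign a value through a void pointer (which must already point to storage that is valid for the type in question), you will have to do something like:

*(int*)pt=42; /* pt must point to an int here */
*(float*)pt=3.14f; /* pt must point to a float here; any type works after a cast */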

Void pointers are mainly used to allow for generic types. You can create data structures that can hold generic values, or you can have functions that take arguments of no specific type. If you wanted a linked list allowing for generic values, you would define your list node like this:

struct ListNode{
  struct ListNode *next;
  void *data;
};

A generic function for doubling a value might look like the following:

#define TYPE_INT 0
#define TYPE_FLOAT 1
void doubleVal(int type, void *var){
  if(type==TYPE_INT){
    *(int*)var*=2;
  } else if(type==TYPE_FLOAT){
    *(float*)var*=2;
  }
}

Called, rather obviously, like the following:

doubleVal(TYPE_INT, &integer);
doubleVal(TYPE_FLOAT, &floatingPoint);

Standard C doesn’t allow pointer arithmetic on void pointers, since the compiler doesn’t know the size of the data pointed to (GCC permits it as an extension, treating the size as 1). However, you may cast a void pointer to a pointer of some other type and perform pointer arithmetic on that.
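
As a sketch of that last point, the helper below (element_at is a hypothetical name) finds an element in an array of unknown type by casting to a byte pointer first. The caller supplies the element size, much like callers of qsort() do:

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the element at `index` in an
 * array whose elements are `elem_size` bytes each. Converting to
 * unsigned char * makes the arithmetic count in bytes. */
void *element_at(void *base, size_t elem_size, size_t index)
{
    unsigned char *bytes = base;   /* void * converts implicitly in C */
    return bytes + index * elem_size;
}
```

With int values[] = {10, 20, 30};, the expression *(int*)element_at(values, sizeof values[0], 2) refers to the last element.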

Designing command-line interfaces

There are few articles on the design of command-line interfaces (CLIs), even if plenty of articles on designing graphical user interfaces (GUIs) exist. This article is an attempt at presenting some of the most important guidelines for CLI design.

The article assumes the command-line utilities are to be used on a *nix system (e.g. GNU/Linux, BSD, Mac OS X, Unix), and it will frequently refer to common tools on such systems.

Types of command-line interfaces

There are three major sorts of command-line interfaces:

  • Non-interactive
  • Interactive, line-based
  • Text-based user interface

Non-interactive programs don’t need any user interaction after invocation. Examples include ls, mv and cp.

Interactive, line-based programs are programs that often need interaction from the user during execution. They can write text to standard output, and may request input from the user via standard input. Examples include ed and metasploit.

Text-based user interfaces are a cross between a GUI and a CLI. They are graphical user interfaces running inside a terminal emulator. Examples include nethack and vi. Many (but not all) text-based user interfaces on *nix use either curses or the newer ncurses.

Non-interactive programs get the most attention in this article, while text-based user interfaces are barely covered at all.

Advantages of command-line interfaces

Why use command-line interfaces in the 21st century? Graphical user interfaces were invented decades ago!

Even today, many command-line interfaces provide several advantages over graphical user interfaces. They are mostly popular among power users, programmers and system administrators, partly because many of the advantages suit their tastes well:

  • Ease of automation: Most command-line interfaces can easily be automated using scripts.
  • Fast startup times: Most command-line interfaces start several times faster than their graphical counterparts.
  • Easier to use remotely: Maybe it’s just me, but I prefer to remotely control computers via SSH over VNC.
  • Lower system requirements: The lower system requirements make CLIs useful on embedded systems.
  • Higher efficiency: Being able to do a lot by just typing some characters instead of searching in menus increases the efficiency at which you are able to work.
  • Keyboard friendly: Grabbing the mouse is a distraction.

Disadvantages of command-line interfaces

Having covered the advantages of command-line interfaces, it would be a good idea to also cover their disadvantages. The biggest disadvantage is that the learning curve is steeper. You will also have to check the manual in many cases, while in a GUI, you can figure out many more things while using the product.

GUIs also have the advantage when it comes to presenting and editing information that is by nature graphical. This includes photo manipulation and watching movies (I have always wanted a program that shows a movie in my terminal by converting it to ASCII art in real-time, that would be sweet).

Try to avoid interactivity

Interactive interfaces are harder to automate than non-interactive user interfaces. The ease of automation is one of the greatest advantages of command-line interfaces, and by making your utility interactive, you give away much of that advantage.

There will be cases when an interactive utility makes more sense than a non-interactive one, but at least don’t make a utility interactive when the advantages are dubious. You shouldn’t create a utility which just asks (interactively) for stuff that could be easily sent to the program as arguments (imagine mv asking you for a source and destination path interactively).

Naming your utility

I would like to stress the importance of a good name for each command-line utility. A bad name is easy to forget, meaning that users will have to spend more time looking it up.

The name should be short. A long name will be tedious to type, so don’t call your version control program my-awesome-version-control-program. Call it something short, such as avc (Awesome Version Control).

The name should be easy to remember. Don’t call your utility ytzxzy.

Arguments

A lot can be said about the arguments to give to programs. First of all, follow standard practice; single-letter options are prefixed with a hyphen, and multiple of them may follow directly (e.g. -la is the same as -l -a). Multi-letter options start with two hyphens, and each such argument must be separated with spaces. Look at the arguments you give to ls, or cp. Follow the way those work. Examples of commands following non-standard practice include dd, which has been criticized for that, many times.

Continuing on the theme of standard practice, if a similar tool to yours (or a tool in the same category, such as file management) uses some option for some thing, it could be a good idea to copy that behaviour to your program. Look at most *nix file management utilities such as mv, cp and rm. All of these provide a flag called -i, with the same behaviour; asking the user interactively to confirm an action. They also provide a -f flag, to force actions (some of which the computer would think seem stupid, but you know what you are doing while using that option, don’t you?).

Options should be optional. That is part of the meaning of the word, but sometimes it is forgotten. Command-line utilities should be callable with zero, nil and no options at all. One example is cd: calling it with no arguments returns you to the home directory. Some programs might not make sense to call with no arguments, such as mv, but in many cases, you can find some sensible standard behaviour. Remember, however, that it is better to just print an error and exit than to do something stupid the user would never expect to happen (“Oh, you gave rm no options, better remove every file on your disk, just to be sure that the file you wanted to remove gets removed”).

In the *nix world, there’s a convention that anything after a double hyphen (--) should be read as a filename, even if the name starts with a hyphen (e.g. cmd -- -FILE-). Likewise, a single hyphen given in place of a filename conventionally means that input should be read from standard input.

Be careful with using many flags whose only difference is the case (e.g. -R and -r), as they will be hard to tell apart and remember.

Always provide a long form of short arguments. Don’t provide just -i, provide --interactive as well.

Provide --version and --help

There are two options you should always include in your program: --version and --help. The first one should print version information for the program. The second should tell one what the program is for, how to use it and present common, if not all, options.
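
The conventions above (short options that can be combined, long forms for every short option, -- to end the options, plus --help and --version) can be sketched in C with getopt_long(). Note that getopt_long() is a GNU extension also found on the BSDs, not part of POSIX. The struct, the function name and the choice of -h and -V as short forms are assumptions made for this example:

```c
#include <getopt.h>
#include <stdio.h>

/* Hypothetical settings for an imaginary utility. */
struct options {
    int interactive;   /* -i, --interactive */
    int force;         /* -f, --force */
    int show_help;     /* -h, --help */
    int show_version;  /* -V, --version */
    int first_operand; /* index of the first non-option argument */
};

/* Fills in *opts from the command line.
 * Returns 0 on success and -1 if an unknown option was given. */
int parse_options(int argc, char **argv, struct options *opts)
{
    static const struct option long_opts[] = {
        {"interactive", no_argument, NULL, 'i'},
        {"force",       no_argument, NULL, 'f'},
        {"help",        no_argument, NULL, 'h'},
        {"version",     no_argument, NULL, 'V'},
        {NULL, 0, NULL, 0}
    };
    int c;

    *opts = (struct options){0};
    optind = 1; /* restart scanning so the function can be called again */

    while ((c = getopt_long(argc, argv, "ifhV", long_opts, NULL)) != -1) {
        switch (c) {
        case 'i': opts->interactive = 1;  break;
        case 'f': opts->force = 1;        break;
        case 'h': opts->show_help = 1;    break;
        case 'V': opts->show_version = 1; break;
        default:  return -1; /* getopt_long has already printed a message */
        }
    }
    opts->first_operand = optind; /* everything from here on is an operand */
    return 0;
}
```

Given the arguments -if -- -FILE-, this sets both interactive and force, and leaves -FILE- as the first operand rather than treating it as an option.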

Reading the input

Make sure your program can read input from pipes, and through file redirection.

If the name of a file is passed as an argument, read the file and use its contents as input. If no such argument was passed, read from standard input until end-of-file (which a user at a terminal signals with CTRL+D).
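
In C, this boils down to choosing a FILE * once and then reading from it without caring where it came from. A minimal sketch, where open_input and count_lines are hypothetical names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: picks the input stream for a utility.
 * A missing file name, or the conventional "-", selects standard input.
 * Returns NULL if the named file cannot be opened. */
FILE *open_input(const char *path)
{
    if (path == NULL || strcmp(path, "-") == 0)
        return stdin;
    return fopen(path, "r");
}

/* Counts lines until end-of-file, whichever stream it is given. */
long count_lines(FILE *in)
{
    long lines = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            lines++;
    }
    return lines;
}
```

Because count_lines() only sees a stream, the same code works for a named file, for a pipe and for redirected input.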

Silence trumps noise

If a program has nothing of importance to say, then it should be quiet (the same applies to human beings). When I run mv, I don’t want it to tell me that it moved a file to some other location. After all, isn’t that what I asked it to do? It should come as no surprise to me that that happened, so I don’t need to be told that explicitly. Only when things you don’t expect to happen do happen should you break the silence. Examples include the file I wanted to move not existing, or me not having permission to write to the directory I tried to move the file to.

Note that your program doesn’t need to tell its version or copyright information during each invocation, or print the names of the authors, either. That’s just extra noise, wasting space and bandwidth during remote sessions, and possibly making the output harder to process automatically, for instance by sending it to another program through a pipe.

Don’t tell me what the output is, either. One should know what the programs one uses do. whoami prints the name of the current user, and only the name. If it printed The name of the current user is: x instead of just x, much more work would be involved in extracting just the name.

Most programs provide a -v (verbose) option, making the program more verbose, and a -q (quiet) option, making the program shut up altogether (except possibly for some fatal error). The default behaviour should not be completely silent in every case, but in most cases. Programs that print something should only print what is relevant.

How to ask users for a yes or a no

At times, your programs might need to ask a user for a yes or a no, for different reasons, the most common being confirmations (“do you really want to do this?”). Others could be problems that the computer may offer to fix (“Table USERS doesn’t exist, do you want me to create it for you?”).

When asking questions requiring either a yes or a no (or a y or an n), you should put (y/n) after the question:

Do you really want to do this (y/n)?

Most programs require you to manually hit return after typing the letter. While this might seem superfluous, most programs do this, and your program should require the user to do this as well in order to be consistent and not surprise anyone.
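
A line-based prompt like this could be sketched in C as follows. The function names are made up for this example, and accepting any answer that starts with y or n (in either case) is a design choice of the sketch, not a rule:

```c
#include <ctype.h>
#include <stdio.h>

/* Interprets one line of input as an answer to a (y/n) question.
 * Only the first non-blank character is looked at, so "y", "Y" and "yes"
 * all count as yes. Returns 1 for yes, 0 for no and -1 for anything else. */
int parse_yes_no(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    switch (tolower((unsigned char)*line)) {
    case 'y': return 1;
    case 'n': return 0;
    default:  return -1;
    }
}

/* Hypothetical prompt loop: repeats the question until the answer is clear.
 * The user has to hit return, since input is read a line at a time.
 * Returns -1 if the input ends before an answer is given. */
int ask_yes_no(const char *question, FILE *in, FILE *out)
{
    char line[64];

    for (;;) {
        fprintf(out, "%s (y/n)? ", question);
        fflush(out);
        if (fgets(line, sizeof line, in) == NULL)
            return -1;
        int answer = parse_yes_no(line);
        if (answer != -1)
            return answer;
    }
}
```

A call could look like ask_yes_no("Do you really want to do this", stdin, stdout).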

Tell me what kind of input you want, and how you want it

If your program asks for a date, and it doesn’t tell me how I should type that date, I will be confused:

Enter a date:

Tell me the format in which I should input the date, and the confusion is gone:

Enter a date (YYYY-MM-DD):

Likewise, if the program just asks for a length, I would be confused. In kilometers, meters, miles, feet…? When different units for things exist, tell me the unit.
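
Stating the format in the prompt also makes it clear what the program should accept and what it should reject. A rough sketch in C of checking a YYYY-MM-DD answer follows; parse_date is a hypothetical name, and the check is deliberately simple:

```c
#include <stdio.h>

/* Parses a date typed as YYYY-MM-DD. Returns 1 on success, 0 otherwise.
 * Only ranges are checked; month lengths and leap years are left out, and
 * missing zero-padding (2024-1-5) is tolerated. */
int parse_date(const char *text, int *year, int *month, int *day)
{
    int y, m, d;
    char extra;

    /* The trailing " %c" skips whitespace (such as the newline left by
     * fgets) and only matches if something else follows the day. */
    if (sscanf(text, "%4d-%2d-%2d %c", &y, &m, &d, &extra) != 3)
        return 0;
    if (y < 0 || m < 1 || m > 12 || d < 1 || d > 31)
        return 0;

    *year = y;
    *month = m;
    *day = d;
    return 1;
}
```

If the function returns 0, the program can repeat the prompt, format included, instead of guessing what the user meant.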

Every program should do one, and only one thing, and do it well

The above heading is one of the most important parts of the Unix philosophy, and it has survived the test of time, being just as solid advice now as it was back in the late 1960s and early 1970s. In practice, this means that you shouldn’t create one file management program; instead, you should create a program for removing files, another for moving them and another for copying them.

Doing one thing well is partly a side-effect of doing only one thing, which allows greater focus.

Understanding recursion

Recursion is something a lot of people find hard, even mystical. I learned to understand it, and even like it quite a bit, by learning Lisp. Recursion lets you solve problems in very elegant ways, and when you understand it, it will be really natural to you.

But the question is, how do you learn it? No, not just learn to know what it is, but really understand it. I have found no magical formula; it just takes practice. As I said, learning Lisp taught me how to understand recursion. The reason was simply that Lisp more or less forces, or at least encourages, you to use recursion for solving many problems, and thus makes you practice it.

What is recursion, really?

To solve problems using recursion, you want to have a function call itself, and for each call, solve a smaller problem than the initial problem in order to finally solve the big one. Take a simple example, a function for calculating the sum of all natural numbers up to some value of n (the example is in Java):

public static int sumTo(int n){
  if(n==1){
    return 1;
  }else{
    return n + sumTo(n-1);
  }
}

We could call this function, for instance by typing sumTo(5);. A smaller problem would be the sum of all natural numbers up to 4, or 3, or 2, or just one. This is exactly what this recursive function does; it solves each of the smaller problems.

The sum of all natural numbers up to 1 is 1, and if n is 1, that gets returned by our function (this is called the base case, which every recursive function should have in order to properly finalize execution). The recursive call occurs on line 5 of the function (return n + sumTo(n-1);). It returns n added to the sum of all natural numbers less than n, thus what we get is roughly:

//The following is called stack winding, which is when we go to the base case
sumTo(5);
  5 + sumTo(4);
    sumTo(4);
      4 + sumTo(3);
        sumTo(3);
          3 + sumTo(2);
            sumTo(2);
              2 + sumTo(1);
                sumTo(1);
                  //The following is called stack unwinding
                  1
              2 + 1 = 3
          3 + 3 = 6
      4 + 6 = 10
  5 + 10 = 15

The above is called a stack trace; they can be useful while starting out with recursion, but when you get experienced with recursion, you won’t need them any more (you will just trust it).

Why do you need recursion? Couldn’t you write the previous function using iteration instead of recursion, like:

public static int sumTo(int n){
  int i, sum=0;
  for(i=1;i<=n;i++){
    sum+=i;
  }
  return sum;
}

Yes, you could. It would probably be faster, too (depending on how the compiler optimizes the code). Yet, there are cases where iteration is a really, really clumsy or inelegant replacement for recursion, typically because you end up managing a stack yourself (recursion, by contrast, is almost always elegant). One problem where recursion is (definitely) superior to iteration is the task of walking through all the directories in a filesystem. Say that we want to print every file on a filesystem. We would do it the following way:

walkthrough(file):
  print name of <file>
  if(<file> is directory):
    for files in <file>:
      walkthrough(files)

The first file would be the root directory, containing all other files and directories. For each file (and directory) found, the name is printed. If we find out that the file is a directory, we go through its children (subdirectories and files) and call the function itself for each of them, printing the name of each child. If any of the children are directories themselves, walkthrough() is called on their children in turn, printing their contents as well.
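
For comparison, here is one way the pseudocode could be turned into real code. It is a sketch in C rather than Java, and it assumes a POSIX system (opendir(), readdir() and lstat() are not part of standard C). Returning a count of printed names is an addition made for this example:

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Prints `path` and, if it is a directory, everything beneath it.
 * Returns the number of names printed. Symbolic links are printed but not
 * followed (lstat), so link cycles cannot cause endless recursion. */
size_t walkthrough(const char *path)
{
    struct stat info;
    size_t printed = 0;

    if (lstat(path, &info) != 0)
        return 0;                       /* cannot inspect it: skip */

    puts(path);
    printed++;

    if (!S_ISDIR(info.st_mode))
        return printed;                 /* base case: not a directory */

    DIR *dir = opendir(path);
    if (dir == NULL)
        return printed;                 /* e.g. permission denied */

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char child[4096];

        /* Skip the entries for the directory itself and its parent. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        int len = snprintf(child, sizeof child, "%s/%s", path, entry->d_name);
        if (len < 0 || (size_t)len >= sizeof child)
            continue;                   /* path too long for this sketch */

        printed += walkthrough(child);  /* recursive case */
    }
    closedir(dir);
    return printed;
}
```

The structure mirrors the pseudocode: print the name, stop if it is not a directory, otherwise recurse into each child.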

Practice (using Scheme)!

To learn recursion, you will need practice (I’m sorry, as I said, I have no magical formula). I would suggest that you learn Scheme, a minimalist dialect of Lisp. There is a famous computer science book that uses it, Structure and Interpretation of Computer Programs (available online for free), and there are some free, online MIT courses using the book (by the authors of the book).

Use Scheme to practice problems using recursion. You can of course use another language, but I would say that some dialect of Lisp would be the best place to learn recursion, as the language itself is very well-suited for solving problems that way.

Some problems that can be solved recursively include:

  • Calculating the factorial of n (n!)
  • Calculating the sum of values in a list
  • Calculating the n:th Fibonacci number
  • Calculating the greatest common divisor of two numbers using the Euclidean algorithm
  • Traversing directories (write a real solution, don’t use pseudocode as I did)
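
To give an idea of how small such solutions tend to be, here is a sketch of the Euclidean algorithm. It is written in C rather than Scheme, and the function name gcd is simply my choice; try the other problems yourself before looking anything up:

```c
/* Greatest common divisor by the Euclidean algorithm.
 * Base case: gcd(a, 0) is a.
 * Recursive case: gcd(a, b) is gcd(b, a mod b), a smaller problem. */
unsigned int gcd(unsigned int a, unsigned int b)
{
    if (b == 0)
        return a;
    return gcd(b, a % b);
}
```

Tracing gcd(48, 18) by hand gives gcd(18, 12), then gcd(12, 6), then gcd(6, 0), which is the base case and returns 6.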

You can find more information on recursion at Wikipedia, as well as examples.

Beginner Programming Mistakes

Interested in programming? Great! Programming is lots of fun. You want to do things well, though, even if it’s not that important while you are just learning. Here I will list some of the beginner mistakes I made myself, and that I have seen others make, so that you can avoid them.

Focus too much on programming languages

I have often seen people who are just getting into programming wonder what programming language to learn. Read this: languages aren’t that important. A good learning language would be Python (which is good otherwise, too), but other languages are good as well. What makes a programmer good isn’t the language the programmer uses, just as the language a poet writes in doesn’t make the poet good or bad.

Don’t just learn a programming language, either; learn good programming practices. You can learn those from, for instance, the book Clean Code.

Run straight into coding

Before writing even a line of code, plan on paper how your application will work at a high level. I made this mistake many times, and I ended up throwing everything away and starting again because of that. How will your application solve a problem? Figure that out first.

Be too optimistic

Sure, we all want to create the next Google/Windows/World of Warcraft (I’m not so sure about Windows…), but that might be too much for a starter project. Google uses very complex algorithms to find good results on the enormous web. Operating systems, such as Windows, are complex in general and require a lot of detailed knowledge about how computers work. Complex games such as World of Warcraft aren’t easy at all either, requiring lots of knowledge about heavy mathematics and physics.

I don’t want to scare you away, but don’t do it because you want to have that done. Sure, you might be working on things like that someday, but start small, and do it because you like doing it, and not for any other reason. How do you know if you like programming? Well, just try it out.

Also, a side note to all those interested in creating games: don’t learn C++ as a first language, as it is very complex. You can create games in other languages too, even if the games industry doesn’t use them as much as C++. A good language is Python, and it provides something called PyGame, making game development a lot easier. If you are into computer games, I would suggest the book Invent your own computer games with Python. Also, programming isn’t everything in the game world; if you aren’t great at mathematics, you might want to consider doing art for games or something else, because game development involves a lot of math.

Give up too early

Programming is hard and it takes a lot of time to get good at it. I have programmed for a few years, and I have a lot to learn. If you hit a stumbling block, don’t give up. Try to solve it. If you don’t like such stumbling blocks, you might not like programming, which, largely, is about problem solving.

Avoid challenges

Avoiding challenges is, to some extent, avoiding the ability to improve. Do projects that you know will be challenging. Do things that you aren’t sure how to do, and then learn how to do them by doing them. Practice makes perfect.

Skip the basics

You want a solid foundation. You won’t get a solid foundation by skipping the basics and moving directly to the “cool” stuff. I know, basics can be boring, but they are the foundation for good programming ability, and you don’t want to be a bad programmer, do you?

What this essentially means is that you should know what “recursion” is before creating a graphical user interface.

The most useful GCC options and extensions

This post contains information about some of the most useful GCC options. It is meant for people new to GCC, but you should already know how to compile and link with GCC, along with the other basics; this is no introduction. If you want an introductory tutorial to GCC, just google for that.

The things this article attempts to cover include (as well as a few other things):

  • Optimization
  • Compiler warning options
  • GCC-specific extensions to C
  • Standards compliance
  • Options for debugging
  • Runtime checks (e.g. stack overflow protection)

For more information on GCC, the freely available An Introduction to GCC book is pretty good. A manual with over 700 pages is available as well (it’s a reference, not a tutorial, though) from the GCC website. The manpages for GCC (man gcc) can also be useful.

Basic compiler warning and error options

An option many programmers always use while compiling C programs is -Wall. It enables several compiler warnings not enabled by default, such as -Wformat, which warns about incorrect format strings. To enable even more warnings, use the -Wextra option. All warnings can be turned off with -w. More warnings make catching potential bugs easier, but may also raise the number of false positives. The exact implications of these options, as well as the individual warnings, are described in the GCC documentation.

To treat compiler warnings as errors, use the -Werror option. To stop compilation when the first error occurs, use -Wfatal-errors.

Standards compliance

By default, GCC may compile C code that is not necessarily standards-compliant, and it might even reject code that complies with the C standard (“the C standard” here means either C89 or C99). Some C standard features are disabled by default, such as trigraphs (which can be enabled with -trigraphs), and several GCC extensions (discussed later in this article) will work even though they aren’t part of the official C standard.

The -ansi option can be used to make GCC correctly compile any valid C89 program (if not, it is due to a compiler bug). It will still accept some GCC extensions (those that aren’t incompatible with the standard); use the -pedantic option to make GCC a pedant when it comes to standards compliance. The -std= option can be used to set the specific standard. There’s a bunch of supported standards (and most standards have several valid names), but the important ones to know are c89 (equal to the earlier -ansi option), c99, gnu89 (C89 with GCC extensions, which was the default when this was written; GCC 5 and later default to a newer GNU dialect) and gnu99 (C99 with GCC extensions). You can also use c1x to enable support for the C1X standard, or gnu1x for the same with GCC extensions (C1X has since been published as C11, and newer GCC releases also accept c11 and gnu11).

Code optimization levels

You can set the code optimization level for GCC, which decides how aggressively GCC will optimize the code. By default, GCC tries to compile fast, so no optimizations are made. With an optimization level set, GCC spends more time compiling and the code may be harder to debug, but the code is optimized better, possibly resulting in a faster program and/or a smaller binary. Because of the longer compile time and the complications for debugging, it can be a good idea not to optimize during development and to save optimization for building the production binary.

The default optimization(less) level is set with the -O0 option, or by giving no optimization option at all.

Some of the most common optimization forms can be activated by using the -O1 (or simply just -O) option. This option tries to produce smaller and faster binaries, and in many cases it can compile faster than -O0 because some optimizations will simplify the program for the compiler as well.

The next level is, perhaps unsurprisingly, -O2. It tries to improve the speed of programs even more than -O1 does, without increasing the size. It can take considerably more time to compile. This is recommended for production releases as it optimizes well for speed without sacrificing space.

The -O3 level does some of the heaviest, most time consuming optimizations. It may also increase the size of the binary. In some cases, the optimizations may backfire and actually produce a slower binary.

If you want a small binary (most of all), you should use the -Os option.

Just pre-process, compile or assemble

When you ask GCC to compile a C program, the following steps are usually taken:

  1. Pre-processing
  2. Compilation
  3. Assembling
  4. Linking

For different reasons, you may want to stop at one of the steps. You might just want to pre-process, for instance, to find an error you suspect comes from a faulty pre-processor directive. If you do so, you will see the output from the pre-processor instead of getting the complete finished binary. Likewise, you may wish not to link because you are going to link manually later on, or maybe you just want to get the assembly output, modify that in some way and then manually assemble and link it. The reasons why you would want to do that aren’t the important thing here, though; how to do it is.

To only pre-process, you should use the -E option. To stop after compilation, use -S. To do all steps but linking, use -c.

Controlling assembly output

Normally GCC produces AT&T syntax assembly output, but if you want to use Intel’s syntax (which is, in my opinion, much more readable), you should set the assembly dialect with the -masm= option, with intel as the value (-masm=intel). Note that this won’t work on Mac OS X.

A useful option for making the assembly code more readable is -fverbose-asm, which adds comments to the assembly output.

Adding debug information

If you are going to debug your program later, and don’t want to debug the assembly version, the -g option is absolutely essential. It adds debugging information, so that you can do source-level debugging on the binary later. The -g option produces debug information in the operating system’s native format, which GDB can use; the related -ggdb option asks for information specifically for GDB, so what you get with it will not necessarily work with other debuggers.

You can set the level of debug information to generate. The default level is 2. With -g1 you can inform GCC to produce minimal debugging information, and with -g3 you can tell GCC that you want even more debug information than what you get by default.

Adding runtime checks

GCC can add different runtime checks to C programs at compilation, making debugging easier and avoiding some of the most common security vulnerabilities in C programs (as long as vulnerabilities/bugs don’t exist in the checking…). Note that runtime checks can degrade performance of programs.

There is an incredibly useful GCC option, -fmudflap, which instruments pointer and array accesses so that many pointer-related errors are caught at runtime. Note that -fmudflap was removed in GCC 4.9; newer releases offer -fsanitize=address (AddressSanitizer) for similar checks.

Stack overflow protection can be enabled by using -fstack-check.

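The -gnato option enables checking for overflows during integer operations, but it only applies to GNAT, the Ada front end of GCC. For C programs, the -ftrapv option makes signed overflow in addition, subtraction and multiplication trap at runtime.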

GCC extensions to the C language

GCC provides several extensions to the C programming language that aren’t actually parts of the C standard. You should always be careful while using non-standard features as that would, in most cases, make your code incompatible with other compilers. Anyway, I will cover some of the most useful extensions GCC provides, and you decide if you use them or not.

All extensions can be found in the GCC documentation.

Likely and unlikely cases

One GCC extension frequently used in the Linux kernel is the __builtin_expect built-in function, commonly wrapped in the likely() and unlikely() macros. The Linux kernel would use something like the following to tell GCC which conditions are likely or unlikely to be true, so that GCC can arrange the generated code to favor the expected branch:

/*This is the likely case which will occur most of the time*/
if(likely(x>0)){
  return 1;
}
/*This is the unlikely case which will occur much more
 *seldom than the earlier case*/
if(unlikely(x<=0)){
  return 0;
}

The likely() and unlikely() macros are defined in the Linux kernel as:

#define likely(x)       __builtin_expect(!!(x),1)
#define unlikely(x)     __builtin_expect(!!(x),0)

If you want to use this outside the Linux kernel, you could always type __builtin_expect(condition, 1) for likely cases and __builtin_expect(condition, 0) for unlikely cases, but it would be much easier to use the same macros as the Linux kernel uses.

Additional datatypes

GCC provides some additional datatypes to the C programming language not defined by the standard. These are:

Note that 80- and 128-bit floating point values are not supported on all architectures (they are supported on common x86 and x86_64 systems, though). On ARM platforms, half-precision (16-bit) floating points are supported.

Ranges in switch/case

A GCC extension adds support for ranges in switch/case statements, so you can have a single case for all values between 10 and 1000, for instance. A range is written as x ... y, where x is the lower bound and y is the upper bound, both inclusive. Keep the spaces before and after the dots; without them, a range between integer values may be parsed wrong. An example switch statement using cases with ranges:

switch(x){
  case 0 ... 9:
    puts("One digit"); break;
  case 10 ... 99:
    puts("Two digits"); break;
  case 100 ... 999:
    puts("Three digits"); break;
  default:
    puts("I sense a disturbance in the force");
}

This is more convenient than writing 1000 different cases (but you wouldn’t solve the problem like that, would you?).

Binary literals

GCC supports binary literals in C programs using the 0b prefix, pretty much like you would use 0x for hexadecimal literals. In the following example, we initialize an integer using a binary literal for its value:

int integer=0b10111001;

10 ways to improve your programming skills

1. Learn a new programming language

Learning new programming languages will expose you to new ways of thinking; especially if the new language uses a paradigm which you aren’t yet familiar with. Many of the ways of thinking that you will learn can be applied to languages that you already know, and you might even want to start using the new language for serious projects as well.

Good languages providing a great educational experience (but not necessarily limited to that) include any Lisp (Scheme is good), Forth, PostScript or Factor (stack-oriented programming languages), J (wonderful array programming language), Haskell (strongly typed purely functional programming language), Prolog (logic programming) and Erlang (concurrent programming goodness).

2. Read a good, challenging programming book

A lot can be learnt from books. While practice is important, reading one of the really good and challenging programming books can be a great way to challenge your thinking and even move it up by a level. Such challenging books would include The Art of Computer Programming (if you want a real challenge), Structure and Interpretation of Computer Programs (SICP), A Discipline of Programming or the famous dragon book.

You can go with less challenging books as well, but avoid books that are “for Dummies” and will teach you something “in 24 hours” or “in 21 days”; you will get very little out of such books in terms of improving programming skills.

3. Join an open source project

What are the advantages of joining an open source project? You will work with others (good thing in case you have only worked on personal projects before), and you will have to dig into, and learn to understand, an unfamiliar code base (which can be challenging).

You can find different projects on sites such as GitHub, SourceForge, Gitorious, Bitbucket or Ohloh.

4. Solve programming puzzles

You can always solve programming puzzles, and many such exist. Math oriented problems can be found at Project Euler, which is, probably, the most popular site for coding puzzles.

You should also try out code golf: a programming puzzle where programmers attempt to solve a given programming problem with the fewest keystrokes. It can teach you many of the more esoteric and special features of the language, and you will have to think creatively about coding (and it is fun!).

Programming puzzles, mainly code golf, are found at codegolf.stackexchange.com.

5. Program

Start writing a program, from scratch. Design all of the architecture and implement it. Repeat.

Coding is best learned by coding. You will learn from your own mistakes, and finishing a project is motivating and much more fun than reading a book is.

6. Read and study code

Study famous software programs, such as the Linux kernel (be warned, it is huge). A good operating system for educational purposes is MINIX3. You can learn many new language idioms, and a thing or two about software architecture. Reading unfamiliar source code is daunting at first, but rewarding.

You can also increase your understanding of some API you use, or a programming language, by reading its implementation.

7. Hang out at programming sites and read blogs

Hanging out at different programming sites (such as forums and StackOverflow) will expose you to other programmers and at the same time, their knowledge.

Also, read blogs, maybe this (if you want) and preferably more. Good blogs are Joel on Software (although he doesn’t blog any more, jewels exist in the archives), Coding Horror and Lambda the Ultimate.

You should also follow news.ycombinator.com.

8. Write about coding

Start writing about coding on a blog, even if it is just for yourself. You can also write answers on different Q&A sites and forums, or you can write tutorials at some sites (e.g. DreamInCode). When you write about coding, you want to make sure that you use the correct terminology and know the why in order to explain problems and techniques. It also lets you reflect on your own programming knowledge and improve your English language skills, which is important in programming.

9. Learn low-level programming

Learning low-level programming and such languages is also useful for achieving a better understanding of the underlying machine. Check out C, and maybe even learn some processor’s Assembly language.

Learn how a computer executes a program and how an operating system works (at a high level, at least). If you really want to get serious about low-level programming, you can read books on computer organization, operating systems, embedded systems, operating system driver development and so on (I’m reading such books at the moment).

10. Don’t rush to StackOverflow. Think!

So you have a problem with your code and you have been trying to solve it for half a minute. What do you (hopefully not) do? Run over to StackOverflow. Don’t. Spend a good deal of time trying to solve the problem on your own, instead. Take paper and a pencil, and start sketching a solution. If that doesn’t work, take a short break to keep your mind fresh and then try again.

If you still haven’t solved the problem after spending an hour on it (or some other considerable amount of time, depending on the size of the problem), then you might go over to StackOverflow, but think on your own first.

The J Programming Language: An introduction and tutorial

I have been reading about the J programming language lately. It is quite different from many other languages, but I will try to cover the absolute basics of it, while, inevitably, much is left out. In the end, this article tries to give you a feel for the language and get you started; other resources are available for achieving higher proficiency in the language.

The language has very good resources available, so kudos for that. Freely available books include Learning J and J for C Programmers.

J is a descendant of the APL programming language. One of the most important changes from APL to J is that you can now use a standard keyboard (without any extra character map or similar trick) to type the code, as J, unlike APL, uses only pure ASCII characters.

J is a member of the array programming paradigm. Other notable paradigms J is a member of are the functional, function-level and tacit programming paradigms.

Just recently, in March 2011, J was released as open source software under the GNU GPL v3.

J is suitable for mathematical and statistical computation. It also happens to be good for code golf, as a bonus, because the language is very terse. J provides standard libraries, an integrated development environment, bindings to other programming languages and built-in support for 2D as well as 3D graphics. Plotting features exist, and I believe J could be used as a Maxima or Mathematica replacement in some cases.

J can be run interactively, which greatly helps while learning and/or trying out new techniques. It is recommended that you download and install J on your system and try out the material in this article in order to achieve a better understanding of the language.

Basic symbols

The + symbol is used to add two numbers together, one on the right and one on the left, e.g. 8 + 4. Spaces are optional (in this case), and can be used to increase clarity. Subtraction uses the - symbol and multiplication *. Division uses %, not /. C and most other common programming languages use / for division and % for modulus, but J doesn’t (| is used as modulus in J).

You can chain symbols for more complex expressions, e.g. 2 * 3 + 4. You might think that this would produce the result 10, as multiplication usually has higher precedence than addition, but J doesn’t have any precedence rules, and evaluates everything from the right to the left (this is also unlike mathematics, where evaluation happens from the left to the right when operators have the same precedence). Thus, 2 * 3 + 4 evaluates to 14, as 3 + 4 is computed first. Symbols lack precedence rules in J because J has hundreds of them, and it wouldn’t be fair to have programmers remember the precedence rules for all of them. You can, however, use parentheses to change the order of evaluation just like in mathematics. To make 2 * 3 + 4 evaluate to 10, type (2 * 3) + 4.

Negative numbers are prefixed with an underscore (_), not a hyphen as in most other languages. 9 -4 subtracts 4 from 9, while 9 _4 is a list containing the elements 9 and negative 4 (lists will be explained later).

Comments

Comments in J are started with NB. and continue towards the end of the line. An example is NB. This is a comment (don’t miss the dot in NB.).

Monads and dyads

Monads are functions that take values on just the right side, while dyads are functions that take one or more values on both sides. Symbols often have separate monadic and dyadic cases, where the symbol acts differently depending on whether it is used as a monad or as a dyad. As an example, - used as a dyad (x - y) is the familiar subtraction symbol, but used as a monad it is the negation symbol: - 9 _4 negates the two values, returning _9 4.

Lists

Lists are one-dimensional arrays. A list can be constructed by placing values separated by spaces after each other, as in 2 9 4. 2 9 4 is a list containing 2, 9 and 4. We can add something to each value of the list by typing 2 9 4 + 42; here we add 42 to each item, producing a new list 44 51 46. You can also type something like 1 2 3 + 4 5 6, which will produce 5 7 9 (1+4, 2+5, 3+6).

To get the length of a list, use the # symbol (as a monad), followed by a list. For instance, # 2 9 4 gives 3.

Verbs and adverbs

J categorises many parts of the language into classes that are much like the classes of different natural languages. Among these are verbs and adverbs, which relate to functions. A verb is pretty much a normal function, such as +. An adverb is a special function that takes another function from the left and changes the way that function gets applied. An example of such a function is /.

/ takes a verb and inserts it between all the elements of a list. For instance, + / can be used to sum the values of list elements, as in + / 8 4 2 4, which is the same as 8 + 4 + 2 + 4 and gives 18. Likewise, you can multiply all elements together with * / and so on.

Another example is ~, which can be used to swap left-side and right-side arguments. For instance, 7 %~ 9 divides 9 by 7, not 7 by 9, as the sides the arguments resided on were swapped.

Assignment

J uses =: for assignment (and =, not ==, for equality comparison). We can, for instance, assign the value 60 to the identifier x, by typing x=:60. Note that the equals sign comes first, which differs from the standard practice of putting the colon first; something that had me confused for a while.

Tables and multidimensional arrays

Using the dyadic $ symbol, you can construct multidimensional arrays. Arrays that have exactly two dimensions are called tables.

We can construct a table using 3 3 $ 1 2 3 4. The numbers on the left side specify the shape of the array: each number gives the length of one dimension, and the count of numbers sets how many dimensions the array has. With 3 3, we create a 3 by 3 table. The numbers on the right specify the values which will be used as elements in the array. A 3 by 3 array would contain 9 elements, but we only specified 4 in the right-hand array. In that case, the array values we have will be repeated, forming the following table:

1 2 3
4 1 2
3 4 1

If we want to loop some values in a list, we can also use $ for that, e.g. 10 $ 1 2 3, producing 1 2 3 1 2 3 1 2 3 1.

Conclusion

J is definitely the coolest programming language I have touched recently (and the languages I have recently played with include hyped languages such as Go, Haskell, Scala, D and Clojure, many of which impressed me, but not as much as J). I look forward to learning more about J, and I hope you will do so as well.

I have not yet used it for anything non-trivial, and I will see how well it will do in such tasks, but so far, I’m pretty damn impressed.

The Vim options you should know of

Vim is a highly configurable editor, and if you can’t configure something as you want, plugins are available. While this post won’t cover plugins, it will cover the most useful options you can set in Vim. This is meant for beginner users of Vim; advanced users probably know all of the following already.

To make options permanent, you can modify your ~/.vimrc file. Type vim ~/.vimrc, type in the commands (one per line) and save.

Line numbers

Showing line numbers in a column to the left can prove useful when programming. To do so, use :set number. If you want to disable that, just type :set nonumber.

Highlight search results

To enable highlighting of search results, type :set hlsearch, and to disable, type :set nohlsearch.

Auto- and smartindent

Autoindent will put you at the same indentation as the previous line when opening a new line below. To enable autoindent, use :set autoindent, and to disable it, use :set noautoindent.

Smartindent is more advanced, and will try to guess the right indentation level in C-like languages (only). To enable smartindent, type :set smartindent, and to disable it, type :set nosmartindent.

Change colorscheme

Use the :colorscheme {name} command to set the active colorscheme for Vim to {name}.

Adjust colors for dark or bright backgrounds

If you have trouble reading text because it is too dark on your dark terminal background, you might want to use the :set background=dark option.

To set colors for a bright background, use :set background=light, which is also Vim’s default.

Enable incremental search

To enable incremental search, type :set incsearch. With incremental search enabled, Vim will find results as you type.
Incremental search can be disabled with :set noincsearch.

Ignore case in searches

If you want to ignore case in searches (e.g. be able to search for vim and find Vim, VIM and other variations), you can use :set ignorecase. To start treating letters with different case as different letters again, type :set noignorecase.

Enable spell checker

To enable Vim’s spell checker, type :set spell. For help on the commands to use with the spell checker, type :help spell. You can disable the spell checker with :set nospell.

The spell checker doesn’t check spelling in code, just comments and non-code files.

Book recommendations for C programmers

The following is a list of recommendations on good reads for programmers in the C programming language:

The C Programming Language, aka the K&R

The classic book, describing all of ANSI C in roughly 200 pages. Written by Dennis Ritchie, who created C, and Brian W. Kernighan. Definitely a book that every C programmer should read and have in their library.

C Programming: A Modern Approach

This is the book I learned C from, and it is definitely one of the best technical books I have ever read. It is around 900 pages long, and very comprehensive. It covers both C89 and C99, and it also tells you about coding best practices and warns you about common gotchas.
It uses graphics to explain many concepts and in general, it is easy to read and understand, while it doesn’t skip details because the author felt like they were “unnecessary”. It has many high-quality exercises and programming projects, and each chapter ends with a question-and-answer part where common questions are answered. If anyone asked me about a book to learn C from, this is what I would suggest.

The C Puzzle Book

The C Puzzle Book engages the reader in some C puzzles, where knowledge of C’s darker corners might be necessary. It is an entertaining read, and you will learn a lot about C from it.

The Standard C Library

This rather old jewel explains all of the C standard library and how to use it, but it doesn’t stop there: it shows full sample source code for all of the standard library as well! If you want to understand C’s standard library, this would be the book to get.

Expert C Programming

Explains how to code like a C expert (as far as a book is able to explain that; the rest will be about practice, practice, practice…). Tells you about the secrets that make a programmer an expert at C.

C in a Nutshell

This is the book that I use as my C reference and the book which I look into when I need documentation (and don’t have a working Internet connection and manpages aren’t sufficient).

C Traps and Pitfalls

C isn’t a language that is going to play nice with you. It has many hidden traps that are hazardous if you aren’t aware of them. This book dissects those, making you a more confident C programmer (hopefully). Maybe C will stop blowing up in your face as well ;-)

Don’t forget…

These were all books on C, but neither the programming language nor knowledge of it makes the programmer. You will need practice, and knowledge of many other topics, such as algorithms, data structures and program design, but these books will give you a solid foundation in the C programming language.