C Programming – Why Mention Data Type of Variable in C?

Tags: c, data types, declarations, io, variables

Usually in C, we have to tell the compiler the type of data in a variable declaration. E.g. in the following program, I want to print the sum of two floating point numbers X and Y.

#include <stdio.h>

int main(void)
{
  float X = 5.2;
  float Y = 5.1;
  float Z;
  Z = Y + X;
  printf("%f\n", Z);
  return 0;
}

I had to tell the compiler the type of variable X.

  • Can't the compiler determine the type of X on its own?

Yes, it can if I do this:

#define X 5.2

I can now write my program without telling the compiler the type of X as:

#include <stdio.h>
#define X 5.2

int main(void)
{
  float Y = 5.1;
  float Z;
  Z = Y + X;
  printf("%f\n", Z);
  return 0;
}

So we see that the C language has some kind of feature with which it can determine the type of data on its own. In my case it determined that X is of type float.

  • Why do we have to mention the type of data when we declare something in main()? Why can't the compiler determine the data type of a variable on its own in main(), as it does with #define?

Best Answer

You are comparing variable declarations to #defines, which is incorrect. With a #define, you create a mapping between an identifier and a snippet of source code. The C preprocessor will then literally substitute every occurrence of that identifier with the provided snippet. Writing

#define FOO 40 + 2
int foos = FOO + FOO * FOO;

ends up being the same thing to the compiler as writing

int foos = 40 + 2 + 40 + 2 * 40 + 2;

Think of it as automated copy and paste. (Because of operator precedence, foos here ends up as 164, not the 1806 you might have expected from 42 + 42 * 42.)
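The usual way to keep a substitution like that from interacting with surrounding operators (a general C convention, not something the question asked about) is to parenthesize the macro body:

#include <stdio.h>

/* Parenthesizing the replacement text makes the macro behave like a
   single value when it is pasted into an expression. */
#define FOO (40 + 2)

int main(void)
{
  int foos = FOO + FOO * FOO;  /* now (40 + 2) + (40 + 2) * (40 + 2) = 1806 */
  printf("%d\n", foos);
  return 0;
}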

Also, normal variables can be reassigned, while a macro created with #define cannot (although you can re-#define it). The expression FOO = 7 would be a compiler error, since we can't assign to “rvalues”: 40 + 2 = 7 is illegal.
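A minimal sketch of that difference (the offending line is commented out so the file still compiles):

#include <stdio.h>

#define FOO 40 + 2

int main(void)
{
  int x = 7;   /* a variable: can be reassigned */
  x = FOO;     /* fine: expands to  x = 40 + 2;  */

  /* FOO = 7;     error: expands to  40 + 2 = 7;  which tries to
                  assign to the rvalue 40 + 2 */

  printf("%d\n", x);
  return 0;
}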

So, why do we need types at all? Some languages apparently get rid of types; this is especially common in scripting languages. However, they usually have something called “dynamic typing”, where variables don't have fixed types, but values do. While this is far more flexible, it's also less performant. C likes performance, so it has a very simple and efficient concept of variables:

There's a stretch of memory called the “stack”. Each local variable corresponds to an area on the stack. Now the question is: how many bytes long does this area have to be? In C, each type has a well-defined size, which you can query via sizeof(type). The compiler needs to know the type of each variable so that it can reserve the correct amount of space on the stack.
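For example, here is a small sketch that queries those sizes with sizeof (the numbers in the comments are typical, not guaranteed, and vary by platform and compiler):

#include <stdio.h>

int main(void)
{
  /* sizeof reports how many bytes the compiler reserves for each type. */
  printf("char:   %zu byte(s)\n", sizeof(char));    /* always 1 */
  printf("int:    %zu byte(s)\n", sizeof(int));     /* commonly 4 */
  printf("float:  %zu byte(s)\n", sizeof(float));   /* commonly 4 */
  printf("double: %zu byte(s)\n", sizeof(double));  /* commonly 8 */
  return 0;
}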

Why don't constants created with #define need a type annotation? They are not stored on the stack. Instead, #define creates reusable snippets of source code in a slightly more maintainable manner than copy and paste. Literals in the source code such as "foo" or 42.87 are stored by the compiler either inline in the generated instructions, or in a separate data section of the resulting binary.

However, literals do have types. A string literal is an array of char (which usually decays to a char *). 42 is an int but can also be used for shorter types (narrowing conversion). 42.8 would be a double. If you have a literal and want it to have a different type (e.g. to make 42.8 a float, or 42 an unsigned long int), then you can use suffixes – a letter after the literal that changes how the compiler treats it. In our case, we might say 42.8f or 42ul.
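A brief sketch of those suffixes in use:

#include <stdio.h>

int main(void)
{
  double d = 42.8;         /* no suffix: the literal is a double        */
  float  f = 42.8f;        /* 'f' suffix: the literal itself is a float */

  int           i = 42;    /* no suffix: int                 */
  unsigned long u = 42ul;  /* 'ul' suffix: unsigned long int */

  printf("%f %f %d %lu\n", d, f, i, u);
  return 0;
}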

Some languages have static typing as in C, but the type annotations are optional. Examples are ML, Haskell, Scala, C#, C++11, and Go. How does that work? Magic? No, this is called “type inference”. In C# and Go, the compiler looks at the right-hand side of an assignment and deduces the type from it. This is fairly straightforward if the right-hand side is a literal such as 42ul: then it's obvious what the type of the variable should be. Other languages also have more complex algorithms that take into account how a variable is used. E.g. if you do x/2, then x can't be a string but must have some numeric type.
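C itself traditionally has no type inference, but as an illustration (this goes beyond the answer above and assumes a compiler with C23 support), the auto specifier added in C23 deduces a variable's type from its initializer in much the same way:

#include <stdio.h>

int main(void)
{
  /* C23 only: the type is deduced from the right-hand side,
     much like var in C# or := in Go. */
  auto x = 42ul;    /* deduced as unsigned long int */
  auto y = 42.8f;   /* deduced as float             */

  printf("%lu %f\n", x, y);
  return 0;
}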