Reference Variables

Technical Preamble

In computer systems, data is ultimately represented as patterns of 0s and 1s. Physically, on a magnetic hard drive, each bit is recorded in the magnetization of a tiny region of the disk surface. And physically, in RAM, each bit is held by a very small electronic circuit that is either “on” for a 1 or “off” for a 0. Depending on the context (numbers and words, colors or sounds), a certain combination, like 0000000001000011, might be interpreted as ‘C’, or as the color “light-turquoise”, or as the pitch C# (554.37 Hz). More on all of this when we get into Topic III of the IB syllabus…

But an important point is that these combinations of 0s and 1s, this computerized data, exist at specific places in computer memory. By knowing the address of the data, the computer can access that data. Memory addresses are usually written in hexadecimal, a numbering system which uses 16 symbols: 0 1 2 3 4 5 6 7 8 9 a b c d e f (upper or lower case). So a hexadecimal number might look like 01A32D, or AAA999. The reason hexadecimal is used for memory addressing is that it is compact and lines up neatly with binary: each hexadecimal digit stands for exactly four bits. A computer with 32-bit memory addressing uses 32 bits, i.e. 32 0s and 1s, for one address (many current machines use 64). With 32 bits there are 2³², or 4,294,967,296, possible addresses, running from 0 up to 4,294,967,295. In hexadecimal that largest address is the much tidier FFFFFFFF, and 2³² itself is 100000000.
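The arithmetic above can be checked with a few lines of Java. This is only a sketch; the class name HexDemo and the helper toHex are made up for the illustration, while Long.toHexString and Integer.toBinaryString are standard library methods.

```java
public class HexDemo {
    // Formats an address-like number as uppercase hexadecimal.
    static String toHex(long value) {
        return Long.toHexString(value).toUpperCase();
    }

    public static void main(String[] args) {
        long addressCount = 1L << 32;            // 2^32 distinct 32-bit addresses
        long largestAddress = addressCount - 1;  // addresses start at 0
        System.out.println(addressCount + " = " + toHex(addressCount));
        System.out.println(largestAddress + " = " + toHex(largestAddress));
        // One hexadecimal digit corresponds to exactly four bits:
        System.out.println("F = " + Integer.toBinaryString(0xF));
    }
}
```

Running main prints 4294967296 = 100000000, then 4294967295 = FFFFFFFF, then F = 1111.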

In programming, a variable is a shortcut for a memory address. We make use of variables for a couple of reasons. Directly using the hexadecimal value would be cumbersome and seemingly random. Can you imagine writing System.out.println(A25D73); and int BBB123 = 42;? Further, by creating our own variable names, we can call them anything we want, and so give them descriptive names, such as studentName or rectangleLength.

Variables as Data: “Literal Variables”

Variables can represent two general things: 1. a piece of data itself, literally, or 2. simply a reference to data. When it’s of the literal variable kind (Java’s usual name for this is a primitive variable), the data it represents will be one or another primitive type. The Java primitives we’ve looked at so far are int for integers, float and double for real numbers and boolean for Boolean (true/false) values. And now would be a good time to throw in one more: char. A char is a 16-bit primitive data type that holds one character, be it a letter, a single digit, or a symbol, such as ^ or ! or &. So, a literal variable is a shortcut for a memory address at which an actual piece of data starts. If we take, for example, an int variable, then the 32 bits (4 bytes) starting at that memory address are there to be interpreted as an integer value.
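The sizes mentioned above can be read straight out of Java’s standard library, as in the following sketch. The class name PrimitiveSizes and the helper toBits are made up for the illustration; the SIZE constants are part of the standard wrapper classes.

```java
public class PrimitiveSizes {
    // Returns the 16-bit pattern stored for a char, as a string of 0s and 1s.
    static String toBits(char c) {
        StringBuilder bits = new StringBuilder(Integer.toBinaryString(c));
        while (bits.length() < Character.SIZE) {
            bits.insert(0, '0'); // pad on the left up to 16 bits
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println("char:   " + Character.SIZE + " bits");
        System.out.println("int:    " + Integer.SIZE + " bits");
        System.out.println("float:  " + Float.SIZE + " bits");
        System.out.println("double: " + Double.SIZE + " bits");
        System.out.println("'C' is stored as " + toBits('C'));
    }
}
```

The last line prints 0000000001000011, the same pattern used as an example in the preamble.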

Variables as Objects: “Reference Variables”

A reference variable is a variable that is not literally a certain value; rather, it is a reference to some other data. So what a reference variable “holds” is, in effect, another memory address. How many bits a reference takes up is decided by the Java virtual machine (typically 32 or 64), and Java never shows you the raw address. When you println() an object in NetBeans whose class has no toString() method of its own, you’ll see the class name followed by something like @a18b3d. It is tempting to read the part after the @ symbol as a memory address, but it is actually the object’s hash code written in hexadecimal, and it is not guaranteed to be an address at all. So the number of digits printed tells you nothing about how many bits the computer uses for addressing.
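What println() shows for a plain object can be rebuilt by hand, which makes it clearer that the text after the @ is the hash code in hexadecimal and not something Java promises to be a memory address. This is a sketch only; the class name ToStringDemo and the method describe are invented for the illustration.

```java
public class ToStringDemo {
    // Builds the same text that Object's own toString() produces:
    // the class name, an @ sign, and the hash code in hexadecimal.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object thing = new Object();
        System.out.println(thing);            // what println shows by default
        System.out.println(describe(thing));  // the same text, built by hand
    }
}
```

The two lines printed are identical, though the digits themselves will differ from run to run.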

Why Reference Variables?? The Case of String

So, why do we have both literal variables and reference variables? A good place to start is Strings, since, uniquely, a String can be made in two ways: straight from a literal, or with the new operator. Either way, the String variable itself is a reference variable.

Strings Made from Literals

Before going further, you need to appreciate that Strings are built on arrays of chars. If I have a String “John”, that’s actually a string of the characters ‘J’ ‘o’ ‘h’ and ‘n’. They are saved as an array of characters (conceptually 16 bits each), which is accessed via the String variable. Like all arrays, Strings will vary in length. “John” has a length of 4, but “Rayworth” has a length of 8. And that’s all right when a literal is immediately assigned to a variable, since the characters are written right there in the code, and so the compiler knows exactly how much memory they need.


String s1 = "John";

With the above declaration of String s1, the computer needs (4 x 16 = ) 64 bits for the four chars, plus a little bookkeeping for the String object itself. The variable s1 then holds a reference to that object.

Strings as Reference Variables

Before moving on, let’s jump back to arrays for a moment. What if we wanted to declare an array of ints? We’d go:


int[] intArray1 = new int[10];

When the array is made, the computer knows exactly how much memory to reserve for the elements: 10 x 32 = 320 bits (40 bytes), and can go about doing so.

But what about an array of Strings: how much memory to reserve for each of them? We’d go:


String[] stringArray1 = new String[10];

Hmmm… we want to reserve memory for 10 Strings, but how big will those Strings each be? How many characters each? The answer is that we don’t yet know. This is where it helps that String variables are reference variables: the array doesn’t hold the Strings themselves, just 10 references, and every reference is the same size. And note that even then, we can’t immediately assign specific memory addresses, since we still don’t know how much memory each String will need. So the 10 String references start out with a value of null. null means that they don’t yet “point” anywhere: “no object yet assigned”. Later on, the elements of the array will be assigned actual Strings, and at that point the computer can go looking for available memory that will accommodate each String. It’s curious to note then that the String variable is itself, in effect, a memory address; the array of characters is at the memory address to which it points.
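The following sketch shows a String array starting out full of null references and then having two of them pointed at actual Strings. The class name NullArrayDemo and the method makeNames are made up for the illustration.

```java
public class NullArrayDemo {
    // Makes an array of ten String references and fills in only the first two.
    static String[] makeNames() {
        String[] names = new String[10]; // ten references, all null for now
        names[0] = "John";               // now element 0 points to a String
        names[1] = "Rayworth";           // Strings of different lengths are fine
        return names;
    }

    public static void main(String[] args) {
        String[] names = makeNames();
        System.out.println(names[0].length()); // 4
        System.out.println(names[1].length()); // 8
        System.out.println(names[2]);          // null: nothing assigned yet
    }
}
```

Note that calling a method such as length() on an element that is still null would stop the program with a NullPointerException, since there is no String there to ask.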

The Advantage of Passing Reference Variables as Arguments

There is another situation where having a String be a reference variable comes in handy. And that’s when it is a very big String, and it will be passed as an argument to other methods. You may not have noted it before, but parameters are actually new local variables for their own method. So when you pass an argument to a method, the argument sent and the parameter received are two different variables, and they both take up memory. Yes, they initially hold the same value, but they are two different things. Note the following code:


public void receivingMethod(String s1){
      System.out.println(s1);
}

public void sendingMethod(){
      String x1 = "Once upon a time there were three bears…";
      receivingMethod(x1);
}

The two variables are two different things. If each of them held the characters themselves, a really big String would end up taking twice the memory when it is sent. And you couldn’t count on the x1 in the sending method being cleared away before the s1 is used; the receivingMethod must be fully executed before the sendingMethod gets to its ending brace, so while the receivingMethod is executed, both variables are in memory. Fortunately, that doubling doesn’t happen, because a String variable holds a reference rather than the characters.

The same goes when the String is made with the new operator.


public void receivingMethod(String s1){
      System.out.println(s1);
}

public void sendingMethod(){
      String x1 = new String("Once upon a time there were three bears…");
      receivingMethod(x1);
}

And now you may be tempted to think, yeah, but s1 is the “Once upon…” sentence, just like x1 is. That’s not correct. As a reference variable, x1 is not the “Once upon…” sentence; rather, it’s a reference to the sentence. And by passing it in the form of a reference, the receiving s1 parameter will also be a reference. The x1 and the s1 are two individual copies of the sentence’s address, so you’ve copied one reference (just a few bytes), but you haven’t copied all the bits that make up all the characters of the sentence; there’s only one sentence, and both copies of the reference point to it. Nice.

And if you’re tempted by any critique at this point, it’s probably that it’s no big deal to copy one sentence in memory twice, especially with modern computers and lots of memory. But what if it wasn’t just one sentence, but a String which was actually several pages of text? Then it certainly would be a waste to double up on the memory when passing it to other methods.
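To see that only the reference is copied, here is a small sketch. The class name SharedStringDemo and the variable seen are made up for the illustration. When == is used on two reference variables, it checks whether they point to the very same object, not whether the characters match.

```java
public class SharedStringDemo {
    // Remembers what receivingMethod was handed (for illustration only).
    private static String seen;

    static void receivingMethod(String s1) {
        seen = s1; // copies the reference, not the characters
    }

    // Returns true if the parameter pointed to the very same String object.
    static boolean sendingMethod() {
        String x1 = new String("Once upon a time there were three bears...");
        receivingMethod(x1);
        return seen == x1; // == on references compares addresses, not contents
    }

    public static void main(String[] args) {
        System.out.println(sendingMethod()); // true: one sentence, two references
    }
}
```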

The new Operator

It is the new operator that makes a new object and hands back a reference to it, ready to be stored in a reference variable. You could read a line with new in it this way:


String s = new String();

“String s is assigned the memory address of a newly made String, so s now points to that String.”

You’ll remember that int arrays are made with the new operator too, and so an int array variable is a reference variable. What’s up with that, since all ints are the same size?


int[] intArray = new int[10];

Since we know that we need to reserve 10 x 32 bits for the array, why bother making the array itself a reference variable? The answer lies in the fact that we’ll very likely pass the array to other methods. And the data of the array could very well take a lot of memory. So, again, we won’t want to double up on the memory used. Therefore, in Java every array variable is a reference variable, even when you make the array the literal way, for example: int[] i = {1,2,3};

Example:


public void receivingMethod(int[] s1){
      for(int i = 0; i < s1.length; i++){    //s1, not x1: x1 is out of scope here.
            System.out.println(s1[i]);
      }
}

public void sendingMethod(){
      int[] x1 = new int[10000];
      for(int i = 0; i < x1.length; i++){
            x1[i] = i;
      }
      receivingMethod(x1);                   //Only the reference is copied.
}

To reiterate, in the above example, we pass the reference argument x1 to the receivingMethod. It copies that reference as the s1 parameter. But the 10000-element int array is not copied, though it can be used by both methods, since variables in both methods reference or “point” to it.
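The same sharing applies to an array made with an initializer list rather than with new, as the following sketch suggests. The class name ArrayAliasDemo and the method setFirst are invented for the illustration.

```java
public class ArrayAliasDemo {
    // Changes the first element of whatever array the parameter points to.
    static void setFirst(int[] s1, int newValue) {
        s1[0] = newValue;
    }

    public static void main(String[] args) {
        int[] x1 = {1, 2, 3};      // made "the literal way", still a reference
        setFirst(x1, 99);
        System.out.println(x1[0]); // 99: the method changed our one and only array
        System.out.println(x1[1]); // 2: untouched
    }
}
```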

The Other Literals

Values that are not assigned to variables, but just used as-is, are also literals. So in the following:


System.out.println("blah, blah, blah");

int x = 3 + 2;

“blah, blah, blah”, and 3 and 2 are all literal values. But they are not assigned to variables. So we have literals, and literal variables, and we have reference variables.

Literal:


double d = 2.3;

- 2.3 is a literal value, and d is a literal variable.


String s = "Czech" + "Republic";

- “Czech” and “Republic” are both literal values; s, like every String variable, is a reference variable, here pointing to the String “CzechRepublic”.

Reference:

 

String s = new String("sdfasdfs");

- “sdfasdfs” is a literal value, but s is a reference variable, pointing to a new String that holds a copy of those characters.
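One way to see the difference between a String made from literals and one made with new is to compare references with ==, as in this sketch. The class and method names are made up for the illustration. The behavior relied on is standard Java: identical String literals (including literals joined with + at compile time) share one object, while new always makes a separate one.

```java
public class StringIdentityDemo {
    // True if two literal expressions end up pointing to one shared String.
    static boolean literalsShareObject() {
        String a = "Czech" + "Republic"; // joined by the compiler
        String b = "CzechRepublic";
        return a == b;
    }

    // True if new String(...) produced a separate object with equal contents.
    static boolean newMakesSeparateObject() {
        String a = "CzechRepublic";
        String c = new String("CzechRepublic");
        return a != c && a.equals(c);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());    // true
        System.out.println(newMakesSeparateObject()); // true
    }
}
```

This is also why Strings should be compared with equals() rather than == when it is the characters that matter.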

Argument Passing By Value or By Reference

A lot of this will be review of things we have covered before, particularly in the last set of notes. But with this the difference between passing arguments as values and as references will be reinforced and made clearer.

Arguments Passed By Value

Clarification # 1: Argument vs. Parameter

Recall that when we want to “send” information from one method to another, the term we use for the information sent is argument, and the term for the information received is parameter.


public void methodSending(){
      methodReceiving("Hello");                 //"Hello" is the argument "sent".
}

public void methodReceiving(String s){         //s is the parameter "received".
      System.out.println(s);
}

Clarification # 2: Literal Value or Variable Sent

Something we should note in passing is that when we pass a value as an argument, we will either pass the literal value itself or we’ll pass a variable, which is, in turn, a shortcut for a value. This is not a significant difference. The above example showed passing of a literal value itself. Following is a slightly different case where what is passed is actually a variable.


public void methodSending(){
      String s = "Hello";
      methodReceiving(s);
}

public void methodReceiving(String s){
      System.out.println(s);
}

The reason the difference between the two ways of passing a value is not so significant is that in both cases, what is “received” by the receiving method is a copy of whatever value was passed.

Clarification # 3: Copied, not “Sent”

Most often when we’re talking about arguments and parameters, we casually use the words “sent” and “received”. And most textbooks will use these words too. But, though it’s not the end of the world, this is not actually correct. When we “send” an argument, we are not “sending” it; it remains in memory where it was before. Meantime, the “receiving” method is not actually “receiving”, but more correctly is copying what is “sent”. So a more correct way of putting it is that when we pass an argument, an exact copy of its value is made and stored as a parameter in the receiving method.

The Implications of “Sending” Value Arguments

Since the argument and the parameter are two different things in memory, when one is assigned a new value, the other is not changed. This is particularly important to remember, since things look different when a method changes the contents of an object it was passed a reference to (more on that later). Meantime, consider the following program segment, in which the parameter is assigned a new value (assume methodSending is run):


public void methodSending(){
      String s = "***Hello***";
      methodReceiving(s);
      System.out.println(s);
}

public void methodReceiving(String s){
      System.out.println(s + " printed from \"methodReceiving\".");
      s = "***World***";                //Only this method's own s is changed.
      System.out.println(s + " printed from \"methodReceiving\" after being changed.");
}

Output:

            ***Hello*** printed from "methodReceiving".
            ***World*** printed from "methodReceiving" after being changed.
            ***Hello***

In summary, with value argument passing, since the argument and parameter are two different things in memory, if after the pass one is assigned a new value, the other is not.
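The same independence can be seen with a true primitive, as in the following sketch. The class name PrimitivePassDemo and the method changeParameter are made up for the illustration.

```java
public class PrimitivePassDemo {
    // Doubles its own copy of the value and returns the result.
    static int changeParameter(int n) {
        n = n * 2;   // changes only this method's n
        return n;
    }

    public static void main(String[] args) {
        int count = 21;
        int doubled = changeParameter(count);
        System.out.println(count);   // 21: the argument is untouched
        System.out.println(doubled); // 42
    }
}
```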

An Interesting Note About Other Programming Languages

Some programming languages allow arguments to be passed either as values or as references to the caller’s own variables. For example, in C++, the & symbol is used to identify a parameter that is to be a reference, not a copy. So, in the C++ example:


void method1(int& x, int y)

the x parameter (declared int&) would be passed as a reference, and the y would be passed as a value.

Why would we want the choice of whether a parameter is value or reference? Well, there are times where you might want the original value passed to change as its “sister” parameter changes, and there are other times where you want them to remain independent of each other. But the developers of Java kept things simple; primitives are sent as values, and for objects it is the reference that is sent. (In terms of what an object is, for now, let’s just keep it as anything made with the new operator, plus Strings and arrays however they are made.)

Reference Variable Passing

The concept of arguments being copied rather than sent still applies when passing reference variables. But what is copied is the reference, not the object that it refers to. Technically, you could say that a “value” is still being copied, but that value is the address of the object.

Recall that whenever we use the new operator, we get back a reference to store in a reference variable. We’ve seen two cases of this so far, when making a String with new, and when making an array. We’ll use arrays for our example, since the contents of an array can be changed after it is made, whereas a String’s characters cannot. Consider the following code segment. (Assume method1 is run.)



public void method1(){
      double[] d = new double[10000];
      for(int i = 0;  i < d.length;  i++){
            d[i] = 334.679;
      }

      System.out.println("Element 654 from method1 before call to method2:   " + d[654]);
      method2(d);
      System.out.println("Element 654 from method1 after call to method2:   " + d[654]);
}

public void method2(double[] d2){
      d2[654] = 2.6;
      System.out.println("Element 654 from method2:   " + d2[654]);
}

Output

            Element 654 from method1 before call to method2:   334.679
            Element 654 from method2:   2.6
            Element 654 from method1 after call to method2:   2.6

In the above example, we declare an array of doubles that is 10000 elements long. Then we assign the number 334.679 to each and every one of those elements. Next we call method2, passing our array to it. And then method2 does what it does, which is merely to change element 654 (the one at index 654) from 334.679 to 2.6. Pretty random chunk of code, but it’s easy enough to follow.

The point is that what we pass to method2 (i.e. what is copied into it) is the reference to the array, not the array itself. And note that the great thing about this is that we don’t double the memory required for working with this array, which would be 64 bits x 10000 = 640,000 bits, or 80,000 bytes (about 80 kilobytes), and could make a difference to the speed of an application running on a small device. And imagine if the array we were talking about was all of the pixels of a 30-megabyte high-resolution photo. Copying that just to pass the array of colors to a method that lightens it, for example, is simply unreasonable.

But, the main thing to note is that when something was changed in the array in method2, that change was made to the array pointed to from method1, because they are the same array, even though pointed to by two different copies of the same reference. See the diagram.
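A related point can be sketched as follows: changing an element through the copied reference affects the one shared array, but assigning a whole new array to the parameter only changes the parameter’s own copy of the reference. The class name ReassignDemo and its two methods are invented for the illustration.

```java
public class ReassignDemo {
    // Changes an element of the array that the caller also points to.
    static void modify(double[] d2) {
        d2[0] = 2.6;
    }

    // Points the parameter at a brand-new array; the caller's array is untouched.
    static void reassign(double[] d2) {
        d2 = new double[] {9.9};
        d2[0] = 1.1; // changes the new array only
    }

    public static void main(String[] args) {
        double[] d = {334.679};
        reassign(d);
        System.out.println(d[0]); // 334.679: the caller's reference never moved
        modify(d);
        System.out.println(d[0]); // 2.6: the shared array was changed
    }
}
```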

Reinforcement of Variables, Reference Vs. Value, & Scope

Variables Are a Value, and Have a Name.

A variable is a value. Yes, it has a name, and it is the name that we use when we are programming. But the variable name written in a program is no more a value than what I type next is an elephant: Elephant.

Variables Can Have Literal Values or Reference Values.

And a variable is always a value; sometimes it is a literal value, and sometimes it is a reference value. If the literal variable x is 3, then its value is 3 and its name is x. If the reference variable y is “Hello”, then its value is whatever address “Hello” gets stored at, say 111AAA, and its name is y.

Variables Can Share the Same Name As Long As They Don’t Share the Same Scope.

Depending on where they are declared within a program, two or more variables could have the same name. But this is only permitted if they have different scope. If a variable of a certain name is declared within the scope of one method, then it’s Ok within the scope of another method for a variable to be called that same name. Though the practice of naming multiple variables the same name is not a good idea. Your variables should be precisely named, and unique.

The following will compile (and though method1’s x is cleared from memory before method2 runs anyway, the main reason it is OK is scope):


public static void main(String[] args){
      method1();
      method2();
}

public static void method1(){
      int x = 0;
}

public static void method2(){
      int x = 99;
}

But the following will also compile, and here method1’s x being cleared from memory cannot be the reason; again, scope is the reason.


public static void main(String[] args){
      method1();
}

public static void method1(){
      int x = 0;
      method2(x);
}

public static void method2(int x){
      x = 99;        //This x is method2's own parameter; declaring "int x" again here would not compile.
}

In the above case, the x which belongs to method1 is still in memory when method2 starts, but it is out of scope there.

Global variables (those defined directly within the braces of the whole class) are always in scope everywhere within that class. Java will actually let a method declare a local variable with the same name as a global one, in which case the local variable hides the global one inside that method, but this causes confusion and is best avoided.

The time variable names definitely cannot be re-defined is any time you have braces within braces inside a method, as commonly happens with loop and conditional structures. The following example demonstrates this:

public void methodWithLotsOfBraces(){
      int x = 0;
      if(true){
            //int x = 99;   //Not permitted (would not compile): x is still within scope.
            int x1 = 99;
            for(int i = 0;  i < 100; i++){
                  System.out.println(x1);
                  //int x = 22;   //Not permitted (would not compile): x is still within scope.
                  int x2 = 22;
            }
      }
}

And note that the issue of scope usually doesn’t come up regarding the re-using of a variable name that’s already been used. (Again, you shouldn’t be doing that anyway; you should be using unique, descriptive names for each of your variables.) The issue of scope is usually that you can’t access the variable because it is out of scope.

So the scope question is: “Are we still within the braces where the variable was defined?” If so, then the variable is still in scope, and can still be used.
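The scope question can be tried out with the following sketch. The class name ScopeDemo and its methods are made up for the illustration. It also shows that Java permits a local variable to reuse the name of a class-level variable, hiding it inside that method, even though doing so is best avoided.

```java
public class ScopeDemo {
    static int x = 5; // class-level ("global") variable, in scope in every method

    static int localX() {
        int x = 99;   // permitted: hides the class-level x inside this method
        return x;
    }

    static int fieldX() {
        return x;     // no local x here, so this is the class-level one
    }

    static int loopTotal() {
        int total = 0;
        for (int i = 0; i < 3; i++) {
            int step = i * 2; // step exists only inside the loop's braces
            total += step;
        }
        // i and step are out of scope here; total is still in scope.
        return total;
    }

    public static void main(String[] args) {
        System.out.println(localX());    // 99
        System.out.println(fieldX());    // 5
        System.out.println(loopTotal()); // 6
    }
}
```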