Logout

Sorting Algorithms

With sorting algorithms we are able to sort arrays in a certain order, for example an array of ints from least to greatest, or an array of Strings alphabetically. Sometimes we want to look at or display at an array orgainzed in such a way. But by far the main reason sorting is so important to programming is that only with a sorted data structure can the binary search work. And the binary search is by far the most efficient searching algorithm we have at our disposal. The power of computers comes not through the sophistication of the mathematical calculation which can be done, rather through the fact that computers can work with so many things at once. And so since computers work with such vast quantities of data, that data is only useful if it can be quickly found. So searching is very, very important, meaning that sorting so that we can use the binary search is also very, very important.

"Algorithms"

First of all, you might be wondering what exactly an algorithm is, since the word is used frequently with reference to things like the bubble sort and binary search. And they are indeed examples of algorithms. In general an algorithm is a step-by-step solution to a problem. So Java algorithms are sequences of Java statements which solve a certain problem, or get some certain job done. But you can also think of an algorithm as being something non-programming language specific. So with the binary search algorithm, for example, in general it is to look at the middle element of a sorted list, and if the element you are looking for is not there, then ask if it is greater or lesser than the element you are looking for, and then ignore looking through the half which it therefore cannot be.

The following algorithms are variations from the sorts found at www.ensta.fr/~diam. and has a "very permissive" copyright via MIT.


4 (Increasingly Efficient) BubbleSorts

"Dumbest" BubbleSort: - sorts, but not efficiently

public void dumbBubbleSort(int[] intArray) {
     for (int pass = 0; pass < intArray.length - 1; pass++) {  
	    //The outer loop will achieve the bubbling up of the highest number each pass
           for (int i = 0; i < intArray.length -1; i++) {
           // The inner loop of "neighbor comparisons" get shorter by one each time (because of intArray.length -1)
               if (intArray[i] > intArray[i+1]) {  //The "neighbor comparison"
                    //Swap
                    int temp = intArray[i];        //To understand the need for temp, imagine trying to swap two glasses of
intArray[i] = intArray[i+1]; //juice...you will need a third glass.
intArray[i+1] = temp;
} } } }


Smarter BubbleSort 1, smarter because one less comparison each pass

The bubble sort compares one less pair ever pass, which makes sense since each time one more largest bubbles up to the end.
But it continues to do passes even if it becomes sorted before the last pass.

public void smarterBubbleSort1(int[] intArray) {
     for (int pass = 0; pass < intArray.length - 1; pass++) {
     //The outer loop will achieve the bubbling up of the highest number each pass
          for (int i = 0; i < intArray.length - 1 - pass; i++) {
          // The inner loop of "neighbor comparisons" get shorter by one each time (because of intArray.length - 1 - pass)
               if (intArray[i] > intArray[i+1]) { //The "neighbor comparison"
                    //Swap
                    int temp = intArray[i];  
intArray[i] = intArray[i+1];
intArray[i+1] = temp;
} } } }

 

Smarter BubbleSort 2, smarter because it ends when sorted

This bubble sort is more efficient because if it goes through comparing all pairs and no swapping is necessary, that means that it all must be correctly sorted, and it will stop.

public void smarterBubbleSort2(int[] intArray) {
boolean sorted = false;
while (!sorted) {
sorted = true; //this is just an assumption, which every time except the last, will actually not be the case //so the last time through the while loop, when we don't get into the swap at least once, //the while will no longer run
for (int i = 0; i< intArray.length-1; i++) { //(So this bubble sort inefficiently checks all neighbors each pass)
if (intArray[i] > intArray[i+1]) {
//Swap
int temp = intArray[i]; intArray[i] = intArray[i+1]; intArray[i+1] = temp;
sorted = false; //sorted goes back to false indicating that we still have to continue sorting
}
}
}
}

 

Smartest BubbleSort, because both one comparison less each pass, and ends when sorted

The following bubble sort combines both of the good features of the other two. It will only continue to go through the outer loop when there is still more sorting to do (accomplished by the while !sorted) And it compares one less pair each pass, in this case accomplished by the n variable that goes one down each pass.

public void smartestBubbleSort(int[] intArray) {
     int n = intArray.length;
     boolean sorted = false;
     while (!sorted) {
          n--; //It is the n which will result in one less comparison happening each outer pass;
               //whereas, with the first bubble sort we could use the 'pass' variable used for the for loop.
          sorted = true;  
          for (int i=0; i < n; i++) {
               if (intArray[i] > intArray[i+1]) {
                    int temp = intArray[i];  
                    intArray[i] = intArray[i+1];  
                    intArray[i+1] = temp;
                    sorted = false; //as in the second bubble sort, if swapping happens we'll want to continue, and so
                                    //with sorted re-set to false again, the while loop continues
               }
          }
     }
  }

(Detailed, precise) Explanation of the Smartest BubbleSort

The method takes in an array, and sorts it. It uses progressive passes through the array in which comparisons of neighboring elements are done, and swaps made where necessary.

A while loop continues the passes of comparisons and swaps down through the array as long as a boolean called sorted remains false. With each pass down through the array, an inner loop repeats the comparison of neigbouring elements, and whenever the first of any pair is greater than the second of the pair, the two are swapped. Over the course of a pass through the array, this will result in the largest as yet un-placed element to be "bubbled up" to its proper position. One less comparison is made each pass, since one by one the next largest elements get put into place; the one less comparison each time is accomplished by decrementing the loop control variable each pass (n--). Every time through the while loop, its loop control variable, a boolean called "sorted", is set to true, but changes to false when even one swap is made. The one time a pass through the array is completed and no swaps are made, that "sorted" variable remains true, thus making the condition for continuing the outer while loop to be false.

 

The Selection Sort

The first thing to notice with most selection sorts is that they work with finding the smallest, rather than the largest element, but this is not so significant a difference from the bubble sort. This efficient version of the selection sort is efficient because it only swaps values once every outer pass. So instead of "bubbling up" a value through a series of swaps, each pass it just keeps track of which is the smallest, and then swaps only once at the end of the full pass.

public void selectionSort2(int[] intArray) {
for (int i = 0; i < intArray.length-1; i++) {
int minIndex = i; // Assumed index of smallest remaining value.
for (int j = i+1; j < intArray.length; j++) {
if (intArray[j] < intArray[minIndex] ) {
minIndex = j; // Remember index of new minimum
}
}
if (minIndex != i) {
//Exchange current element with smallest remaining. //But note that this only happens once each outer loop iteration, at the end of the inner loop's looping
int temp = intArray[i];
intArray[i] = intArray[minIndex];
intArray[minIndex] = temp;
}
}
}

(Detailed, precise) Explanation of the Selection Sort

An outer for loop repeats passes down through the array. When each pass begins, it assumes the element where it starts contains the smallest element. One element at a time down through the array, we check to see whether that element is actually smaller than the one assumed to be. And every time that this is found to be the case, that newest smallest element is assigned to be the smallest - in fact, what we keep track of is the index of the element that is the smallest (with a variable minIndex). Upon reaching the end of each pass, we see if the smallest is actually the one we started with, or if that changed. If it changed, that means that the element we started with is not in fact in the correct position, rather, what we found to be smallest of the remaining should be there, so we swap the initial element with that one (now pointed to by the variable minIndex). Each outer loop pass starts at one element beyond the last time, since the smallest of the remaining elements is put in the proper place each pass.

 

 

Comparing Strings

The comparison of ints is straight-forward and can be done with simple greater than and less than operators (> <). But to compare Strings, you will need to use another of the String class' methods. We have already used equals. Now we will need to employ compareTo and compareToIgnorCase.

compareTo and compareToIgnore case are called by a String, and take in another String as a parameter. They return a number which will be the difference between the two Strings' ASCII values of their first characters. And if the first letters are the same, the calculation is done on the second letters and so on. In this way, if the first String is alphabetically smaller than the second String, a negative number is returned, and if the first String is alphabetically larger than the second String, a positive number is returned.

***If you are unsure of how characters are represented by numbers in the ASCII code set, see the "appendix" at the bottom of this page.

String s = "Chemistry";
String s1 = "Architect";
System.out.println(s.compareTo(s1));

Output: 2
(Since the ASCII value for 'C' is 67, and the ASCII value for 'A' is 65, and 67 - 65 = 2.)

String s = "Architect"; 
String s1 = "Chemistry";
System.out.println(s.compareTo(s1));

Output: -2
(Which is 65 - 67)

String s = "Chemistry";
String s1 = "Churchill";
System.out.println(s.compareTo(s1));

Output: -16
(Since the ASCII value for 'e' is 101, and the ASCII value for 'u' is 117, and 101 - 117 = -16. The 'C' and the 'h' are ignored since they are the same in both words.)

 

So in a sort or a search which is comparing Strings, we employ this fact. We know that if the returned value from compareTo is negative, the first String is "less than" the second. And vice versa, if the returned value is positive, the first string is "greater than" the second.

So if we wanted to achieve

if(StringA > StringB)

We would write:

if(StringA.compareTo(StringB) > 1)

 

And as you would expect, the difference between compareTo and compareToIgnoreCase is that upper and lower case don't matter in the comparison. So "aardvark".compareTo("Banana")

yeilds -1 just like "Aardvark".compareTo("banana") does, regardless of the small a and capital B or capital A and small b.

 

***ASCII appendix

ASCII stands for the American Standard Code for Information Interchange. It is the code set that virtually all computers in the world use to represent basic English letters and symbols, along with the digits 0 - 9. The basic ASCII set represents 128 characters, and is uniform throughout the world. The extended ASCII set of 128 additional characters is non-standard, and there exist many versions for various languages. The idea is that each character in the ASCII set is represented by an agreed upon combinations of binary 0s and 1s. So capital A is 0100 0001, capital B is 0100 0010, and so on. These binary "bits" represent one of two states, most often, in a computers, an electrical switch on (1) or an electrical switch off (0). When the binary number is converted to decimal, you get the standard decimal equivalent of the characters. For example, capital A (binary 0100 0001) in decimal is 65, and capital B (binary 0100 0010) is 66, and so on.

Here is a link to an ASCII talbe showing the first, standard, 128 characters, along with their binary value and decimal equivalent.