Exploring Arrays: Introduction

I am assuming that you are familiar with arrays/vectors, so what I gonna do here is explore some data structures that we use to deal with array problems. I will present here the famous string, the HashTable and the StringBuilder, then, in another post we will go through problems that make use of them.

String

I like to say that strings are like tomatoes, because it’s a fruit but people confuse with vegetables. In short, strings are char arrays.

This is why you can do this in most of the modern languages:

string myString = "Tomato";
char c = myString[2];
System.out.println(c);  // The output is "m"

Behind the scenes, Java, C# and other languages present you a String as a base class like “int” or “bool”, but it’s a class named “String” that step forward and do the dirty job. You may have noticed in C# for example, that if you type both “string” or “String” it will work, but the “String” will be presented you like a class.

So we can say that: “Tomato” = {‘T’, ‘o’, ‘m’, ‘a’, ‘t’, ‘o’}

HashTables

Hash tables are data structures that map values with keys. The goal here is to fetch a value using a key with a fast lookup time. In general, HashTables has a built in array and a hash function that compute the key.

HashTables are extremely useful when dealing with array problems, because we can use it as dictionaries.

The trick here is that HashTables are very efficient depending on the Hash function. The idea is that a hash function should allocate the value in a place that will never collide with another value.

Observe this code:

Hashtable<String, Float> shop = new Hashtable<String, Float>();
shop.put("grapes", 4.25f);
shop.put("strawberries", 1.7f);
shop.put("tomatoes", 3);

// Fetch the price of grapes
int comission = 5; // $5 dollars because I'm greed!
System.out.println(shop.get("grapes") + commision); // -> 9.25

The following thing is happening behind the scenes:

HashTable

The image above shows a simple hash function (the grey circle) that takes the first letter of a word as a key, and store the value inside the vector/array. So when we fetch a value on line 8, we use the same string as a key, then, the Hash function generate the key, and pick from the array.

Obviously, this is a ridiculous do-not-do-that implementation of a hash function, we are just using it to exemplify, you can foresee that if we add “tortilhas” to the list, it will cause a collision with tomatoes. Of course, we have more complex implementations of hash tables in the wild, but collisions still happens, and when it happens, HashTables start using Linked Lists, but this is a topic for another post 🙂

String Builder

This guy save lives when dealing with heavy string concatenation. The word of rule that I always say is: Do not concatenate strings! So what the hell should I do then?

Say you have this code:

public string mergeWords(String[] words)
{
   string mergedWord = "";
   for(String word : words)
   {
      mergeWord = mergeWord + word;
   }
   return mergedWord;
}

This type of code can be enough to eject you from your interview straight to the kids garden. Offended? There’s a reason, and you’ll glad to hit this post:

Remember on the beginning of this post that Strings are arrays of chars? When you use the “+” to concatenate a string into another, you are creating a third array with the size of the sum of your first and second array, and then, char by char, loop on each array and add to the third array, then, discard the previous two arrays.

Basically:

  1. “Tomato” + “es” =>
  2. new {null ,null ,null ,null ,null ,null ,null ,null};  =>
  3. “T” => {‘T’, null ,null ,null ,null ,null ,null ,null} =>
  4. “o” => {‘T’, ‘o’ ,null ,null ,null ,null ,null ,null} =>
  5. and so and so on
  6. {‘T’, ‘o’ ,’m’ ,’a’ , ‘t’ , ‘o’ , ‘e’ , ‘s’} => “Tomatoes”
  7. => Discard “Tomato” and “es”;

Seems a small operation, but imagine you concatenating a 50,000 document.

What a StringBuilder do is to keep an array of chars of a reasonable size, and storing every “put”, and then, just concatenate once when returning its value.

There are different approaches for increase this internal array when the time comes, some expand the array 100 by 100 slots every time it gets full, this “100” is called “buffer”. You can read this entry to know how to use a StringBuilder.

If you want to practice the use of arrays in the real world, I highly suggest you to implement your own version of StringBuilder. Here’s mine:

// This class wasn't tested, I just made it for fun straight on this post
// Do not use in production code!
public class StringBuilder
{
	private int buffer = 100;
	private char[] content;
	private int usedSlots = 0;

	public StringBuilder(int buffer)
	{
		this.buffer = buffer;
		content = new char[this.buffer];
	}

	public void Append(String str)
	{
		if ((content.length() - usedSlots) < str.length())
		{
			ExpandBuffer();
		}
		// append the string
		for (int I = 0; I < str.length(); I++)
		{
			content[usedSlot++] = str[I];
		}
	}

	private void ExpandBuffer()
	{
		char[] ncontent = new char[this.content.length() + buffer];
		System.arraycopy(content, 0, ncontent, 0, this.content.length());
		this.content = ncontent;
	}
}

Next on this same category, I will solve some interview questions and comment the “whys” and points you have to keep in mind when dealing with arrays.

This entry was posted in Data Structures and tagged , , . Bookmark the permalink.