Wednesday 2 April 2014

Entry #5: Sorting Blues

Sorting is an important thing in both life and computer science. It keeps things organized, which makes them easier to find.
Computer science has many sorting algorithms, but five classic ones come up here: Bubble Sort, Selection Sort, Insertion Sort, Merge Sort, and Quick Sort. My friend Lindsay described how these sorts work in depth in her post, and my friend Leon described their efficiency in great detail in his post. So instead, I'll be breaking the idea of sorting algorithms down into simple English.

Note: A sorting algorithm is considered stable if it keeps equal items in the same relative order they started in.

Let's steal a high schooler's binder, any binder. You open it up and you'll probably notice a plethora of notes. Some pages will have notes that were taken in class, some pages will have small notes taken from the readings (that he was supposed to do weeks ago), and probably some pages filled with doodles done during classroom video time. More importantly, you'll probably notice that the pages are not in any organized order. It starts off with notes about the Civil War, then immediately jumps to a lengthy description of the Hiroshima bomb, a crude doodle of what may be a unicorn, and then detailed diagrams of the battle strategies of the Union army.

So, for the sake of his sanity during exam time, this student is going to organize his binder. This brings us to the main point of this post: which sorting method will work best for him?
(Going with the assumption that he's sorting in chronological order, i.e. Civil War first, then World War One, then silly doodles at the very back.)

Bubble Sort:
Imagine if he used bubble sort through the whole thing: going through the entire binder swapping adjacent pages, then going through it over and over, never really knowing where a page is supposed to end up. Of course this might seem okay if there are only a few pages, but what if there were hundreds? Would you really want to repeat this method that many times? You know you're wasting time; the page you had just a few moments prior would have been perfect in this position, but it's all the way back there. Would you really want to waste all that time? You could be studying, or watching bad finales.

Now of course, if this binder had been completely and perfectly sorted, you'd know it immediately: if you go through it once and don't have to make a single swap, you know perfectly well that everything is sorted and you can stop.
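If the student wrote the binder-flipping down in Python, it might look something like this rough sketch (the function name is mine, and I'm assuming the pages can be compared directly, say by date):

def bubble_sort(pages: list) -> None:
    """Sort pages in place by repeatedly swapping adjacent out-of-order pairs."""
    swapped = True
    while swapped:
        # if a full pass makes no swaps, the binder is sorted and we stop early
        swapped = False
        for i in range(len(pages) - 1):
            if pages[i] > pages[i + 1]:
                pages[i], pages[i + 1] = pages[i + 1], pages[i]
                swapped = True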

Selection Sort:
So the student decides bubble sort is not the sort he wants to use, and he changes his methodology slightly. Instead of swapping pages every time, he only swaps once per pass through the binder. He keeps a tab on the latest page he has seen so far and switches the tab whenever he finds a later one. When he reaches the part he has already sorted, he swaps the page just before the sorted part with the one he had tabbed.

Good job, student. You've become slightly more efficient! But you're still not at your most efficient. You're swapping things, and swapping can be annoying: you have to keep a tab on where the page is supposed to go and put the page somewhere for safekeeping while you do all this flipping. Again, if you're looking at only maybe 3 sheets, whatever, you're young, you're independent, you don't need no rules. But when you're looking at 500+, or even maybe 20+, yeah, you might want to rethink your strategy...
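Here's a rough Python sketch of the tabbing idea, under the same assumptions as before (the name is made up):

def selection_sort(pages: list) -> None:
    """Sort pages in place: each pass tabs the latest page in the unsorted part
    and swaps it once, into the spot just before the already-sorted back part."""
    for end in range(len(pages) - 1, 0, -1):
        latest = 0                          # the "tab"
        for j in range(1, end + 1):
            if pages[j] > pages[latest]:
                latest = j                  # found a later page, move the tab
        # one swap per pass: the tabbed page joins the sorted part at the back
        pages[end], pages[latest] = pages[latest], pages[end]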

Insertion Sort:
So the student has decided to scrap it all and start fresh. This time he decides to sort in parts. He pretends the first page is already sorted, then he places the next page in its proper spot relative to the first. Now these two make up the sorted pile, and he continues this trend with the rest of the pages.

Yay, he's learning! Though what if his notes were in backwards order? Then every new page he picks up would be earlier than everything in the pile, so he'd have to shift the entire sorted pile every single time. But on the plus side, it's stable! So everything from the same time period will stay in its proper order. (If it wasn't stable, his notes might look like: "World War Two was the Allies vs the Axis. Germany took over France. France was totally prepared for war!" When it is stable, his notes would look like: "World War Two was the Allies vs the Axis. France was totally prepared for war! Germany took over France.")
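Again, a rough sketch of the growing-a-sorted-pile idea (the name is mine); the strict > comparison in the shifting is what keeps it stable:

def insertion_sort(pages: list) -> None:
    """Sort pages in place by inserting each page into a sorted pile at the front."""
    for i in range(1, len(pages)):
        page = pages[i]
        j = i - 1
        # shift later pages back by one until we find this page's spot;
        # using > (not >=) keeps equal pages in their original order
        while j >= 0 and pages[j] > page:
            pages[j + 1] = pages[j]
            j -= 1
        pages[j + 1] = page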

Merge Sort:
Okay, so the student was pretty smart thinking of insertion sort, but he could be even more efficient. When the battle gets tough: divide and conquer.

Take out all the pages and keep splitting each pile roughly in half until every pile has only one page (or none) left, then merge the piles back together in order, two at a time.

The problem with this is that you'll need more space than the other methods, since you'll need somewhere to keep all these piles. And this gets REALLY problematic when you have a LOT of papers; then you'll need a really big table. But it's stable, and its worst case is actually pretty fast!
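A rough sketch of the divide-and-conquer, again with made-up names. Notice it builds new piles instead of sorting in place, which is exactly the extra table space problem:

def merge_sort(pages: list) -> list:
    """Return a new sorted list: split in half, sort each half, merge them back."""
    if len(pages) <= 1:
        return pages                    # a pile of 0 or 1 pages is already sorted
    mid = len(pages) // 2
    left = merge_sort(pages[:mid])
    right = merge_sort(pages[mid:])
    # merge: repeatedly take whichever pile has the earlier front page
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:         # <= keeps equal pages in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]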

Quick Sort:
Isn't merge sort efficient? But this student's table isn't that big; if he tried to use merge sort, all the papers would probably fall off the table and everything would get ruined. So he can settle for Quick Sort. Its worst case is slower than merge sort's, but hey, it can be done in place, so it takes far less extra space. It's not stable, though.

Take the first page and let that be your "pivot". Put the pages earlier than the pivot to its left and the pages later than the pivot to its right, then repeat on each side.

This can become as slow as bubble or insertion sort, since there might be nothing earlier than your pivot, leaving you to re-sort almost the whole pile every time. But this can usually be fixed by picking a random page as your pivot and hoping for the best!
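One last rough sketch (names mine): this version goes straight to the random-pivot fix and sorts in place, which is where the space saving over merge sort comes from:

import random

def quick_sort(pages: list, lo: int = 0, hi: int = None) -> None:
    """Sort pages[lo:hi + 1] in place: pick a random pivot page, move earlier
    pages to its left and later pages to its right, then sort each side."""
    if hi is None:
        hi = len(pages) - 1
    if lo >= hi:
        return                              # 0 or 1 pages: nothing to do
    # pick a random page as the pivot and park it at the end for now
    pivot_index = random.randint(lo, hi)
    pages[pivot_index], pages[hi] = pages[hi], pages[pivot_index]
    pivot = pages[hi]
    split = lo
    for j in range(lo, hi):
        if pages[j] < pivot:                # earlier than the pivot: move it left
            pages[split], pages[j] = pages[j], pages[split]
            split += 1
    pages[split], pages[hi] = pages[hi], pages[split]   # pivot into its spot
    quick_sort(pages, lo, split - 1)
    quick_sort(pages, split + 1, hi)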

And that's the end of the metaphor.
TLDR:
Bubble - Don't do it, unless your list is almost sorted
Selection - All around better than bubble (fewer swaps), but not as good in the best case and just as bad in the worst case
Insertion - Decently fast and stable, but only really ideal for small or nearly sorted lists
Merge - Fast even in the worst case, but takes up a lot of space
Quick - Generally faster than the first three; its worst case is slower than merge sort's, but it usually takes less space

Sunday 23 March 2014

Entry #4: Linked Lists

Linked List: The formal definition of a linked list is a data structure made up of nodes which, when linked together, create a sequence.
Linked lists are recursive by nature and, because of their structure, are capable of growing dynamically with no set size.
Linked lists are built from two types of nodes: empty and non-empty.
 (*** Note: Empty nodes consist of None ***)
 A diagram that may be more helpful in seeing how it is a recursive data structure is the one below:

The biggest strength of linked lists, as opposed to other list types, is that they have no set size and don't need one big block of memory reserved up front.

Take an array list for example. If you want to add something to a full array list, you need to copy everything over to a different, larger list. If you want to delete something, everything after that value has to be shifted up by one.

Linked lists, however, can bypass this. This is best described by the diagrams below:
(*** Note: Green lines are original links, Red signifies broken or deletion, Purple shows the new links ***)
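To go with the diagrams, here's a minimal Python sketch of the relinking idea (the Node class and the insert_after/delete_after names are just made up for illustration):

class Node:
    """One node in a linked list: a value plus a link to the next node
    (None marks the empty end)."""
    def __init__(self, value: object, next_node: 'Node' = None) -> None:
        self.value = value
        self.next = next_node

def insert_after(node: 'Node', value: object) -> None:
    # adding: point this node at a new node that points at the old next node
    node.next = Node(value, node.next)

def delete_after(node: 'Node') -> None:
    # deleting: skip over the unwanted node by redirecting one link
    if node.next is not None:
        node.next = node.next.next

No copying the whole list, no shifting everything up by one.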

Hence, when creating structures that need to grow and shrink a lot, like queues, linked lists are often the best option.

Wednesday 12 March 2014

Entry #3: True CS Students Stay Awake at 2AM Drawing Tree Traversals

In light of E3A and a slight 2 A.M. recursion headache, I am going to write a short, sweet post about the different tree traversals. This is to serve as an organised reference, a stark contrast to the state of my room.

Def: Tree Traversal
  • the process of visiting/examining/updating each node in a tree data structure
  • there are multiple ways to traverse a tree
1.  Pre-Order:
    
         root -> left subtree -> right subtree

Things to Note:
  • Start from the top and go down
  • You hit everything in your path
  • Go all the way down the left and then you can start on the ones on the right that you didn't get

Analogy: kind of like when you're getting your 8 gym badges, very linear, you go forward not back

Code:
def preorder(tl: 'TreeList') -> list:
    # visit the root first, then everything in the left subtree,
    # then everything in the right subtree
    return [] if tl is None else [tl[0]] + preorder(tl[1]) + preorder(tl[2])

2. In-Order:

       left subtree -> root -> right subtree

Things to Note:
  • Start at the bottom-most left node
  • Always visit "root" after the left child but before you go to the right child 
  • If there is no left child go straight to root
Analogy: The left is a cave (zubats...), the root is a healing centre, and the right is a field



Code:
def inorder(tl: 'TreeList') -> list:
    # visit everything in the left subtree first, then the root,
    # then everything in the right subtree
    return [] if tl is None else inorder(tl[1]) + [tl[0]] + inorder(tl[2])

3. Post-Order:
 left subtree -> right subtree -> root

Things to Note:
  • Start at the left-most node
  • The main root will be your last hit
Analogy: Those "puzzles" where you have to take out all the underlings, starting from weakest to strongest, until you get to the final boss. (left side is weaker than right side cause right handed)

Code:
def postorder(tl: 'TreeList') -> list:
    # visit everything in the left subtree, then the right subtree,
    # and the root last
    return [] if tl is None else postorder(tl[1]) + postorder(tl[2]) + [tl[0]]
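To tie the three together, here's a tiny usage sketch. I'm assuming a TreeList is a nested [root, left subtree, right subtree] list with None for an empty subtree, which is how the code above treats it:

#        1
#       / \
#      2   3
t = [1, [2, None, None], [3, None, None]]

print(preorder(t))   # [1, 2, 3]  root, then left, then right
print(inorder(t))    # [2, 1, 3]  left, then root, then right
print(postorder(t))  # [2, 3, 1]  left, then right, then root last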

 And that concludes 2 AM blogging and the plethora of bad Pokemon analogies.

Monday 3 March 2014

Entry #2: Of Mountains and Monks


There are three morals to be taken from this story.

The first one is: don't ask monks on mountains for stories. Trust me I've been there, they make you pay and it's a waste of money.

The second one is that this entire story is a metaphor for recursion (albeit a badly written one).

To expand on this overstretched metaphor: recursion is a feature in programming that occurs when a function calls itself, like we see with the monk and his story. Now, in the real world, recursion is actually a lot more useful and less banal than what you see in this story. A good example of when recursion is awesome is factorials.
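For instance, a quick Python sketch of a recursive factorial might look like this:

def factorial(n: int) -> int:
    """Return n! (n factorial), computed recursively."""
    if n <= 1:
        return 1                        # the exit: 0! and 1! are both 1
    return n * factorial(n - 1)         # the function calls itself on a smaller n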

As my friend Lindsay explained and showed in her slog entry, using recursion as opposed to "normal" coding can make for shorter, cleaner code. From her example, you can see that recursion saved 4 lines of code.

The main downside of recursion, however, is that it can take up more memory and time, since every recursive call adds another stack frame, whereas a loop reuses the same frame on each iteration. Also, recursion isn't the easiest thing to wrap your head around.

In my personal opinion, I don't believe recursion is terribly useful. I feel that saving memory and time is more important than the elegance of the code or saving a few lines. Furthermore, it's very easy to write a recursive function that never stops if you don't have a full understanding of the recursion.

This connects to the third moral of our story: if you're writing recursion, make sure you write an exit (a base case).

Imagine, if you will, that the story never had an ending, that the original monk just kept talking about boys, monks, mountains and houses. Then the original boy would probably look something like this by the time they got to the nth monk in the story:

And that concludes my long drawn out metaphor.

TLDR:
  1. Recursion is when a function calls itself
  2. It is a good alternative as it can shorten your code
  3. Recursion often takes up more memory and time than normal loops
  4. Unless you want to sit around waiting for something that never ends, always make sure your recursive function has an exit
  5. Monks need better story telling skills
This comic was drawn because the author just really missed drawing...

Tuesday 21 January 2014

Entry #1: Object Oriented Programming

Object Oriented Programming (OOP) is a model for writing computer programs. Unlike before, when a program was just a list of instructions acting on the computer's memory, OOP focuses on objects interacting with other objects. These objects are usually instances of classes, are self-contained, and are described by data fields. Many languages now use this model, notable examples being Python and Java.

But that's all just fancy wording found on Wikipedia or in some overpriced textbook, and it doesn't actually explain very well how it works. So to make it easier to understand, let's break it down.

First remember, OOP is objects interacting with each other, where these objects are instances of classes. So starting from the bottom, what is an object?

Object: 
Essentially, an object consists of state (its descriptors, e.g. name, colour, etc.) and behaviour (e.g. walking, calling, etc.). The state of an object gets stored in variables, while its behaviours are exposed through functions.

In terms of material things, an object is like anything we can find today, like the coffee machine in the caf. Its state/variables would be its available flavours, while its behaviours/functions would be getting coffee or something of the like.

Class:
If objects can be seen as the everyday things that we use, a class can be seen as the blueprint. The blueprint would describe the details of the coffee machine: how it delivers coffee, what flavours it has, soy or skim, and so on. Every coffee machine built from this blueprint is an instance of the blueprint/class.
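To make the coffee metaphor concrete, here's a tiny Python sketch (the CoffeeMachine class and everything in it are completely made up for illustration):

class CoffeeMachine:
    """A blueprint (class) for coffee machines like the one in the caf."""

    def __init__(self, flavours: list) -> None:
        # state/variables: what this particular machine can make
        self.flavours = flavours

    def get_coffee(self, flavour: str) -> str:
        # behaviour/function: exposed as a method
        if flavour in self.flavours:
            return 'One ' + flavour + ' coffee coming up!'
        return 'Sorry, no ' + flavour + ' here.'

# every machine built from the blueprint is an instance of the class
caf_machine = CoffeeMachine(['dark roast', 'french vanilla'])
print(caf_machine.get_coffee('french vanilla'))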

In terms of my own preference, I have none; there's obviously going to be a strong bias towards the only model you've ever used over one you've only briefly read about in a Wikipedia entry.

However, I do feel that it is easier to understand code in terms of objects than in terms of the previous model of working through a list of instructions acting on the computer's memory.