Post It Notes at the Office

How to Write Efficient Code (by example) Part 1: Databases and Loops

October 26th, 2015 Posted by Website Development 0 comments on “How to Write Efficient Code (by example) Part 1: Databases and Loops”

TL;DR – How to write efficient code when using loops

Bring as much code outside the loop as possible, especially lines that are computationally expensive (such as querying the database) because loops are O(n) “big-oh of n” and need to be run once for every input item, whereas code outside a loop is O(c) “constant time” and take the same amount of time regardless of input size.

Background

One of the most beneficial classes I took in college was Analysis of Algorithms. In essence the class taught us what inefficient code was and how to write code that was not inefficient – this is a super valuable skill I’ve found that freelancers without college degrees seem to be lacking. But it’s not as complex as you might think. In this article, I will attempt to explain the core concepts I retained from the class, and show you how you can use them to avoid spending hundreds of hours rewriting your own code.

Some points before we begin:

  • This applies to all programming languages, because it has to do with algorithms, not programming
  • All code examples are written in PHP because I know the language
  • We will use the variable N to mean the number of items in our dataset. For example if N = 100, we have 100 teams in our database table below.

Example: Displaying Data from a Database

Problem: You have a database with a list of teams, and you want to print the names of the teams to the screen.[/vc_column_text][/vc_column][/vc_row]

Example 1: Inefficient Solution

One way to solve this problem is to simply loop over the table and display information as we go until we reach the end.

// prepare the statement
$statement = $db->prepare("SELECT name FROM teams WHERE id = ?");
$statement->bind_result($result);
$statement->bind_param("i", $id);

//set initial variables
$id = 1;
$success = TRUE;

//execute, fetch & print until no more results
while($success == TRUE){
  $statement->execute();
  $success = $statement->fetch();

  echo $result . "\n";
  $id++;
}

 

Click here if you don’t know about using SQL prepared statements

This code will get the job done. However, when we look at the time it takes to run, the time and resources it takes to execute this algorithm grows proportionally to the input at unsustainable rates! In other words, if the number of rows in the database table (n) doubles, this algorithm will roughly double in the amount of time, memory, or processing power it will take to run, and after n > 10,000 or so, this algorithm is taking over 1300 milliseconds to perform, which is far slower than today’s standards. When tested at 100,000 records, this algorithm took a whopping 20 seconds to perform! It helps to make a graph of how the numbers increase (see below)

 

Running Times  for the Inefficient Solution

The execution time doubles when the number of rows in the database table (n) double. At n = 10,000 it takes 1.5 seconds to execute the script, and at n = 100,000 it takes over 20 seconds to execute!

n time (ms)
100 47
1000 230
2000 417
3000 538
4000 772
5000 883
6000 991
7000 1124
8000 1188
9000 1338
10000 1515
100000 20261

Why is this algorithm so inefficient? The $statement->execute()  line is inside the while() loop, causing the script to query the database (very high cost) every single time it needs to access a new record. This is fixed below to show a mega improvement.

Example 1: Efficient Solution

Here is a much better way to solve the problem! Instead of querying the database each time for every data point, now we are querying the database once and the only thing we have in our while loop is the fetch and print commands (much lower cost).

// prepare the statement
$statement = $db->prepare("SELECT name FROM teams");
$statement->bind_result($result);
$statement->execute(); //<-- Now this statement only executes once

//set initial variables
$success = TRUE;

//fetch & print until no more results
while($success == TRUE){
 $success = $statement->fetch(); //<-- loads data into a variable 
 echo $result . "\n";
}

Now that the execute line is taken out of the while loop, we are querying the database only once. We can see that there is about a 5x increase in performance from this one change in approach! That could mean the difference between your users waiting five seconds to refresh a page and waiting only one. This could mean the difference between a website that “feels” quick, responsive, innovative, and efficient and one that feels like it’s 10 years old.

Running Times  for the Efficient Solution

Instead of 1.5 seconds to display 10,000 records we now take 350ms. This will be noticeable (users tend to notice anything over 150ms) but it is a vast improvement over a 1.5 second hang time.

n time (ms)
100 47
1000 57
2000 99
3000 151
4000 161
5000 186
6000 275
7000 279
8000 298
9000 321
10000 351
100000 2080

 

 

Delving a little deeper: Analysis of Algorithms

 

As you may have noticed, both of these graphs look exactly the same. The only difference is the numbers on the scale on the left. There is one important conclusion to make from this:

As the input grows, both solutions will still steadily increase in the amount of time it takes to run the script. At 1 million records in a database, both of these solutions will take longer than a user is willing to wait. Both of these solutions are inefficient because they grow linearly.

This is because both of these solutions are classified as O(n) – pronounced “big-oh of n” – algorithms. This means that the cost increases at a linear function of n. To put that in perspective, code run outside a loop is O(c) “constant time” and take the same amount of time regardless of input size. Code that is in a nested loop is O(n²) “big-oh of n squared” and take geometrically more time as the input expands. This classification is extremely important to understand and is at the core of writing efficient code.

Stay tuned to the rest of the How to Write Efficient Code (by example) series to learn more about efficient code, how to analyze it, and why it is important.


[/vc_column_text][/vc_column][/vc_row]

Call us now to talk over your project and

get an expert’s opinion