I've been working on web-based projects built mainly with PHP and JavaScript, where I mostly use Zend Framework and jQuery. I am interested in any webpage optimization techniques - for a faster web! Stoimen is a DZone MVB and is not an employee of DZone and has posted 96 posts at DZone.

Algorithm of the Week: Sequential Search

11.29.2011

This is the easiest to implement and the most frequently used search algorithm in practice. Unfortunately the sequential search is also the most ineffective searching algorithm. However, it is so commonly used that it is worth considering several ways to optimize it. In general, sequential search, also called linear search, is the method of consecutively checking every value in a list until we find the desired one.

Basic Implementation

The most natural approach is to loop through the list until we find the desired value. Here’s an implementation in PHP using a for loop, something that can easily be written in any other programming language.

// unordered list
$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
 
// searched value
$x = 3.14;
$index = null;
 
for ($i = 0; $i < count($arr); $i++) {
	if ($arr[$i] == $x) {
		$index = $i;
	}
}
 
if (isset($index)) {
	echo "The value $x exists in the list on position $index!";
} else {
	echo "The value $x doesn't appear to be in the list!";
}

This is really the most inefficient implementation. There are two big mistakes in this code. First, we call count() to get the length of the list on every iteration of the loop. Second, after we find the desired element we don’t break out of the loop, but keep iterating through the rest of the array.

Forward Linear Search

A common mistake is to continue the search even after we've found the desired value!

Yes, if the element is repeated, leaving out the “break” lets us find its last occurrence. Otherwise the loop simply keeps iterating to the end of the array with no practical value.
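
To make the difference concrete, here is a minimal sketch that runs both variants on a list with a repeated value. The function names findFirst and findLast are hypothetical helpers introduced only for this illustration; they are not part of the original examples.

```php
<?php
// Hypothetical helpers for illustration only.

// Stops at the first match thanks to the break.
function findFirst(array $arr, $x)
{
    $index = null;
    $length = count($arr);
    for ($i = 0; $i < $length; $i++) {
        if ($arr[$i] == $x) {
            $index = $i;
            break;
        }
    }
    return $index;
}

// No break: every later match overwrites the earlier one,
// so the loop always scans the whole list.
function findLast(array $arr, $x)
{
    $index = null;
    $length = count($arr);
    for ($i = 0; $i < $length; $i++) {
        if ($arr[$i] == $x) {
            $index = $i;
        }
    }
    return $index;
}

$arr = array(1, 3.14, 2, 3.14, 5);
echo findFirst($arr, 3.14), "\n"; // 1
echo findLast($arr, 3.14), "\n";  // 3
```

If only the first occurrence matters, which is the usual case, the version with the break does the same job with less work.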

Optimization of the forward sequential search

// unordered list
$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
 
// searched value
$x = 3.14;
$length = count($arr);
$index = null;
 
for ($i = 0; $i < $length; $i++) {
	if ($arr[$i] == $x) {
		$index = $i;
		break;
	}
}
 
if (isset($index)) {
	echo "The value $x exists in the list on position $index!";
} else {
	echo "The value $x doesn't appear to be in the list!";
}

Optimized Forward Linear Search

It's important to break the loop once you've found the value in the list!

Even with this little optimization the algorithm remains inefficient. As we can see, on every iteration we have two conditional expressions. First we check whether we’ve reached the end of the list, and then we check whether the current element equals the searched element. So the question is: can we reduce the number of conditional expressions?

Searching in reverse order

Yes, we can reduce the number of comparisons in the forward linear search by searching in reverse order. Although it looks much the same, reversing the order of the search lets us discard one of the conditional expressions.

// unordered list
$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
 
// searched value (assumed to be present in the list)
$x = 3.14;
// start at the last valid index
$index = count($arr) - 1;

while ($arr[$index--] != $x);

echo "The value $x found on position " . ($index + 1) . "!";

Note that we need to adjust the index because of the $index-- expression.

Indeed, here we have only one conditional expression, but the problem is that this implementation is correct ONLY when the element exists in the list, which is not always true. If the element doesn’t appear in the list, this code can lead to an infinite loop. OK, but how can we stop the loop even when the list doesn’t contain the desired value? The answer is: by adding the searched value to the list.

Sentinel

The above problem can be solved by inserting the desired item as a sentinel value. This way we’re sure that the list contains the value, so the loop is guaranteed to stop even if the value wasn’t part of the list to begin with.

Sentinel Sequential Search

By inserting a sentinel at the end of the list, we're sure the value exists in the list!

// unordered list
$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
 
// searched value
$x = 3.14;
$arr[] = $x;
$index = 0;
 
while ($arr[$index++] != $x);
 
if ($index < count($arr)) {
	echo "The value $x found on position " . ($index - 1) . "!";
} else {
	echo "The value $x not found!";
}

This approach can be used to overcome the problem of the reverse linear search approach from the previous section.
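
The same idea can be wrapped in a reusable function. This is only a sketch: the name sentinelSearch is hypothetical, and it assumes the array is a plain list with consecutive integer keys starting at 0. Because PHP passes arrays by value, the sentinel is appended to a local copy and the caller's list is left untouched.

```php
<?php
// Hypothetical wrapper around the sentinel technique shown above.
// Returns the index of the first match, or null when $x is absent.
// Assumes $arr is a list with consecutive integer keys starting at 0.
function sentinelSearch(array $arr, $x)
{
    $length = count($arr); // length before the sentinel is added
    $arr[] = $x;           // sentinel, appended to the local copy only
    $index = 0;

    // Only one comparison per step: the sentinel guarantees termination.
    while ($arr[$index] != $x) {
        $index++;
    }

    // Stopping on the sentinel itself means the value was not in the list.
    return $index < $length ? $index : null;
}

$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
var_dump(sentinelSearch($arr, 3.14)); // int(3)
var_dump(sentinelSearch($arr, 7));    // NULL
```

Returning null for a miss keeps index 0 distinguishable from "not found", as long as the caller checks the result with === null or isset().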

Complexity

As I said at the beginning of this post, this is one of the most inefficient searching algorithms. The best case is when the searched value is at the very beginning of the list, so we find it on the first comparison. The worst case is when the element is located at the very end of the list, or is missing altogether, and we have to check all n elements. Assuming we don’t know where the element is and it is equally likely to be anywhere in the list, we need about n/2 comparisons on average, so the complexity of this algorithm is O(n).
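
The best, worst and average cases can be checked by counting comparisons directly. The sketch below uses a hypothetical helper, countComparisons, that mimics the forward search with a break; searching once for each of the n distinct elements gives an average of (n + 1) / 2 comparisons, which grows linearly with n.

```php
<?php
// Hypothetical helper: counts how many element comparisons a forward
// search with break performs before it finds $x (or gives up).
function countComparisons(array $arr, $x)
{
    $comparisons = 0;
    foreach ($arr as $value) {
        $comparisons++;
        if ($value == $x) {
            break;
        }
    }
    return $comparisons;
}

$arr = array(1, 2, 3, 3.14, 5, 4, 6, 9, 8);
$n = count($arr);

// Search once for every element and sum up the cost.
$total = 0;
foreach ($arr as $value) {
    $total += countComparisons($arr, $value);
}

echo "Best case: ", countComparisons($arr, $arr[0]), "\n";       // 1
echo "Worst case: ", countComparisons($arr, $arr[$n - 1]), "\n"; // 9
echo "Average: ", $total / $n, "\n";                             // 5
```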

Different cases

We must remember, however, that the number of comparisons actually performed can vary depending on whether the element occurs once or several times in the list.

Is it so ineffective?

Sequential search can be very slow compared to binary search on an ordered list. But this is not the whole truth. Sequential search can be faster than binary search for small arrays; as a rule of thumb, it is assumed that for n < 8 the sequential search is faster.
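
For comparison, here is a minimal sketch of binary search on an ordered list. The function name binarySearch is hypothetical, and the code assumes the input is already sorted in ascending order; on unsorted data the result is meaningless.

```php
<?php
// Hypothetical helper: binary search over a list sorted in ascending order.
// Returns the index of a match, or null when $x is absent.
function binarySearch(array $sorted, $x)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high) {
        $mid = (int) (($low + $high) / 2);
        if ($sorted[$mid] == $x) {
            return $mid;
        }
        if ($sorted[$mid] < $x) {
            $low = $mid + 1;  // discard the lower half
        } else {
            $high = $mid - 1; // discard the upper half
        }
    }
    return null;
}

$sorted = array(1, 2, 3, 3.14, 4, 5, 6, 8, 9);
var_dump(binarySearch($sorted, 3.14)); // int(3)
var_dump(binarySearch($sorted, 7));    // NULL
```

Each step halves the remaining range, but it also does more bookkeeping per step than the sequential loop, which is one reason the simple scan can win on very short lists.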

Application

The linear search is really very simple to implement, and most web developers reach for the basic forward implementation, which is the most inefficient one. On the other hand, this algorithm is quite useful when we search in an unordered list. Searching in an ordered list is a different matter, because ordering can dramatically change which search algorithm we should use. In fact, searching and sorting algorithms are often used together.

A typical case is pulling something from a database, usually in the form of a list, and then searching for some value in it. Unfortunately, in most cases the database already orders the returned result set, and yet most developers still perform a sequential search over the list. Once again, when the list is ordered it is better to use binary search instead of sequential search.

Let’s say we have a CSV file containing the usernames and the names of our users.

Username,Name
jamesbond007,James Bond
jsmith,John Smith
...

Now we fetch these values into an array.

// work case
$arr = array(
	array('name' => 'James Bond', 'username' => 'jamesbond007'),
	array('name' => 'John Smith', 'username' => 'jsmith')
);

Now using sequential search …

// using a sentinel
$x = 'jsmith';
$arr[] = array('username' => $x, 'name' => '');
$index = 0;
 
while ($arr[$index++]['username'] != $x);
 
if ($index < count($arr)) {
	echo "Hello, {$arr[$index-1]['name']}";
} else {
	echo "Hi, guest!";
}
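
PHP also ships built-in functions that perform the same kind of scan. As a hedged alternative to the hand-written loop, the sketch below uses array_column() (available since PHP 5.5) to extract the usernames and array_search() to find the first match. It repeats the sample data so that it can run on its own.

```php
<?php
$arr = array(
    array('name' => 'James Bond', 'username' => 'jamesbond007'),
    array('name' => 'John Smith', 'username' => 'jsmith'),
);

$x = 'jsmith';

// array_column() extracts the usernames; array_search() scans them
// and returns the first matching key, or false if there is none.
$index = array_search($x, array_column($arr, 'username'));

// Compare with !== false: a match at position 0 is also falsy.
if ($index !== false) {
    echo "Hello, {$arr[$index]['name']}";
} else {
    echo "Hi, guest!";
}
```

No sentinel is needed here, so the original array is never modified.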

Note: All the examples in this article are written in PHP.

 

Source: http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/

Published at DZone with permission of Stoimen Popov, author and DZone MVB.


Comments

Mason Mann replied on Tue, 2011/11/29 - 3:53pm

"Unfortunately the sequential search is also the most ineffective searching algorithm. "
Only a true academic makes such bold claims.
For a large class of inputs and scenarios, it's by far the fastest because of cache locality and prefetching (a truth that extends to hard drives.)
You'd be surprised how shitty binary search can be because it completely trashes the cache, so you need a fairly large input before the log(n) behavior gets interesting.
" but it is assumed that for n < 8 the sequential search is faster."
n can be waaay bigger.

Travis Romney replied on Tue, 2011/11/29 - 4:58pm in response to: Mason Mann

An array is almost always stored in memory, not disk. n < 8, holds true for the vast majority of array implementations.

Mason Mann replied on Tue, 2011/11/29 - 5:52pm in response to: Travis Romney

Uh, you miss the point. CPU's have caches too, and a cache miss is incredibly expensive. Judy arrays beats any old hash implementation, simply because it does linear searches whenever data is in the cache. That's how much it matters.

John David replied on Wed, 2012/01/25 - 8:07pm

Unfortunately, sequential search has cost issues compared to other search algorithms such as binary search or indexed search.

But if you have a sorted list of data, I think it is very useful. 
