Back to the roots: .NET binary search and the meaning of the negative number of the Array.BinarySearch() return value
Recently I gave a group of developers a task witch can be simplified to following simple problem: you have a sorted array of elements; find the index of a given element in this array. They came up with following solution:
//given array
int[] sortedArray = new[] { 1, 5, 8, 12, 18, 20 };
//Create a list from the array
List<int> list = new List<int>(sortedArray);
//use IndexOf method to find the element
int result = list.IndexOf(8);
Of course it works find but didn’t we do too much work? Can we probably use the fact that the given array is sorted? Do we need an additional data structure List?
If we read the Performance Considerations section at MSDN class documentation we see that to perform IndexOf() operation the list first sorts the content and then performs the binary search on it.
For more information see: http://msdn.microsoft.com/enus/library/6sh2ey19.aspx
So we can simply use the BinarySearch() method of the Array static class.
//perform binary search
int result = Array.BinarySearch(sortedArray, 8);
So the BinarySearch() does basically the same as the IndexOf() does, but quicker, sometimes it’s better to remember basic algorithms instead of using complex data structures to accomplish the very narrow task efficiently.
For more information on binary search algorithm see: http://en.wikipedia.org/wiki/Binary_search_algorithm
Why does the BinarySeacrh() do more then IndexOf()
Let’s search a non existing element using both methods and compare results.
//given array
int[] sortedArray = new[] { 1, 5, 8, 12, 18, 20 };
//Create a list from the array
List<int> list = new List<int>(sortedArray);
//use IndexOf to find element
int result1 = list.IndexOf(17);
//Use Array.BinarySearch
int result2 = Array.BinarySearch(sortedArray, 17);
The result looks strange:
IndexOf(17)=1 BinarySearch(17)=5
Both of them deliver a negative value which should be interpreted as element not found. However in opposite to List.IndexOf() which always delivers 1 the Array.BinarySearch() returns some “magic” negative value. Following explanation can be found in class documentation:
Return Value
The index of the specified value in the specified array, if value is found. If value is not found and value is less than one or more elements in array, a negative number which is the bitwise complement of the index of the first element that is larger than value. If value is not found and value is greater than any of the elements in array, a negative number which is the bitwise complement of (the index of the last element plus 1).
For more information see: http://msdn.microsoft.com/enus/library/y15ef976.aspx
Sounds complicated but means following: the binary search travels the binary tree, at some point it realizes that element can not be found and stops on a certain node. If you negate the result of the binary search and subtract 1 you will get the index of this element.
Index

0

1

2

3

4

5

Element

1

5

8

12

18

20

So if you search for 7 the result will be the negated index of element 8 minus 1, equals 2.
So if you search for 17 the result will be the negated index of element 18 minus 1, equals 4.
The formula is:
 Array.BinarySearch(sortedArray, 22)  1
Important: if you are searching for a value which is greater then the last element this formula gives you array length, which is a non existing index. So you should not use the result of this calculation to access an element of the array without checking if it’s less then length.
So if you search for 22 the result will be 7. (7) 1 = 6 = array length.
The return value of BinarySearch() can be used in wide range of algorithms. For instance when solving the problem of vertex collision detection. It might be useful also when working with text segments.