24
Aug

Count occurrences of a number in a sorted array with duplicates using Binary Search


In this lesson, we will solve a very famous
programming interview question and the question is that
given a sorted list of integers, we want to
find out how many times a particular element
occurs in this list. So let’s say we are given list in the
form of an array. Here we have an array of size twelve and the elements are in increasing order.
Now how many times does number five occur in
this array? 5 occur five times and how many times does number two
occurred in this array? 2 does not occur in the array. So we will be given any such sorted array A and a number X and we have to find out
how many times this number X exists in this array A. Now we want to solve this problem programmatically. So let us think through
the different approaches that we may want to follow. The simplest approach is
that they can scan the whole array and count the occurrences of X in the array. So if I have to write a
function FindCount() that we will take as argument the array, its size n and element X to be searched
for. The element X for which we want to find the
count. so the logic would be pretty
straightforward, we will take a variable initially, let’s say the name of the
variable is count. And we initialize it to zero and then we run a loop starting 0 till n-1 and if A[i] or the element at index i is equal to element X,we increment count so count
becomes count+1 and finally when we come out of this loop, we return count. So we are
performing a linear search where we are scanning the whole array to search for element X. We could optimize this algorithm a
little bit, something like this because the array is sorted, once we reach the size when A[i] becomes greater than X, we can stop counting but still in the
worst-case this loop will run and n times when all the elements
in the array are same. then this loop will run n times, so in
the worst case this algorithm, the runnning time of this algorithm is
proportional to N. In other words the time complexity is be O(n). This is a very simple solution for this problem
and if you give this solution in a programming interview, the interviewer would be like “I don’t
like this!” “Give me something better”. So how do we find out a better solution? Whenever in a
problem we are given a sorted data or a sorted collection, we
should try to think about applying one very
famous algorithm that makes the best use of this property that the data is sorted
and this algorithm is binary search. So can we make use a
binary search in this problem?. One thing that you can do is that using
binary search we can find out any occurrence of
the number X in the array. So let’s say in this example
we want to find out count of number 5. So let’s say we
will find out using binary search in O(logn)
time that number 5 exists at index 6. Now because the array is sorted, all
occurrences of 5 will be adjacent to this. So we
can go towards higher indices, starting this index and look for all the 5 and then we
can go towards the lower indices, and look for all occurrences of 5 but
once again if the whole array is number 5 only all
elements being same still is a sorted array, then we will
scan all the elements, we will access all the
elements in the array and eventually the time complexity will
be O(n) only. The time taken will
be proportional to n only in the worst-case. So using binary search kind of does
not give us much advantage, if we use binary
search in its basic form. O(n) because to perform
binary search we will take O(logn) then to find out all the adjacent
occurrences of X we will take O(n) in the worst case
so for higher values of n log(n) is negligible in comparison to n, so this is eventually O(n). We are not writing, writing pseudo code for this
approach. We will that as an exercise for you. With these two approaches we are still O(n)
in the worst case. so what do we do?. Well if you’d remember from our previous lessons on binary
search, we can write a binary search to find out the first
occurrence of an element in an array and similarly we can write a variation
of binary search to find out last occurrence of an
element in an array. and this forms the basis of our third
approach. And I’ll clear some of this and make
some space. Okay so we can use one variant of binary search to find out the first occurrence of an element in array
and we can use another variation of binary search to find out the last occurrence and if we know the last and the first
index at which the element occurs then we also know the
count of it in the array. So once again we will write a method
FindCount() that will take an array A, its size n and element X and let’s say we find the first occurrence of the element in the array,
using the method. FindFirst() which is a variation of binary search and we will use another method call to another variant of binary search that will give us the last
occurence of the element in the array then we can return count as last
index – first index + 1, and I’m not handling
the case here when the element is not present in the array,let’s say we will handle it ,well in our actual
implementation. Okay, so that first method call if we are
using binary search will work in O(logn) and the second
method call to find out the last occurence will also work in
the O(logn). So overall the time complexity to find
out the count of an element in the array would
be O(logn). and that’s really great. We have
described how to find out the first or the last occurrence of an element in a
sorted array using binary search. We had written the
pseudocode for the algorithm in our previous
lesson and there is a link to the previous lesson in
the description of this lesson but let us now go and write some real code to solve this problem. I will
write a C program.let us first write a
simple normal binary search and then they will modify it to find the
first or the last occurence, let’s say we have a method BinarySearch() that gives me index of element X in the array. In binary
search we first define 2 indices low
and high to variables low and high initially set to 0 and n-1 respectively and then
we find the mid element as low+high/2 and then we compared the middle element
with the number X and if we if middle element=X, we have found
our elements so we simply return the index mid, else if X

Tags: , , , , , , , , , , , , , , ,

41 Comments

  • Ritesh Kasat says:

    nice

  • Anil kumar says:

    .alot informative 🙂
    perfect explanation of every topic !! hats off for your recommendable work ..
    i found these videos best !!! amongst all cyber tutorials…

  • Kulamani Sahoo says:

    am a Big Fan of U r Video Series…Simple CLear and to the Point ina ll Sorting Video's..But it will be Good If you compare the Time and space complexity for all the sorting in one Video…Also want to Know why Quick Sort not Merger sort?

  • mycodeschool says:

    Hi Kulamani,
    Quick sort does not need any extra memory while merge sort needs extra memory proportional to the size of array to be sorted. That's why quick sort is preferred over merge sort. But there may be scenarios where merge sort would be preferred. It all depends upon the scenario.

  • v9 says:

    very clear explanation. thanks

  • coding math says:

    You forgot to check if lastIndex == -1 (element appears only once).
    You can't return last-first+1  if last is -1. In that case you should return 1 since first index was not -1.

  • Andrew Stauble says:

    THANK YOU.

  • Anuj Gupta says:

    These videos are very helpful and informative. The guy explains wonderfully without rushing into anything. Examples are good and the subtitles are accurate. Thank you mycodeschool for coming out with such good videos. Kudos to you and your team.

  • A S Gowri Sankar says:

    Thanks as always for clear explanation.

  • Hengame Hoseini says:

    Thanks for clear explanations

  • Андрій Б'ялик says:

    i've done it with xor.

  • Mihai Cosmin Bianu says:

    what if we want to determine the number of duplicates, How do we write the code?

  • SVince Onal says:

    how about if float array?

  • Sunny Meska says:

    i am going to infinite loop when i am running this program please help me

  • arun sharma says:

    is bool data type supported in C ?

  • Mohamed Ali says:

    man I LUVV YOUU

  • Mu Tian says:

    This is awesome!

  • alok porwal says:

    the way you speak is fantanstic!Kudos.

  • Nitesh Mishra says:

    what if similar elements are not in contiguous location?
    how will calculate count?

  • RAGHAV CHADHA says:

    implementation in java
    import java.util.Scanner;

    public class tofindduplicatesinarrayusingbinarysearch {

    public static void main(String[] args) {
    Scanner sc = new Scanner(System.in);
    int arr[] = {2, 10, 10, 10, 20, 30, 50, 50};//works for only sorted array
    System.out.println("enter element to be searched");
    int x = sc.nextInt();
    int firstindex = binarysearch(arr, x, true);
    if (firstindex == -1) {
    System.out.println("count of" + x + " is 0");
    } else {
    int lastindex = binarysearch(arr, x, false);
    System.out.println("count of" + x + " is" + (lastindex – firstindex + 1));
    }

    }

    public static int binarysearch(int arr[], int x, boolean searchfirst) {
    int low = 0;
    int high = arr.length – 1;
    int mid;
    int result = -1;
    while (low <= high) {
    mid = (low + high) / 2;
    if (x == arr[mid]) {
    result = mid;
    if (searchfirst) {
    high = mid – 1;//go on searching towards left//first most occurence
    } else {
    low = mid + 1; //go on seraching toward right//last most occurence
    }
    } else if (x < arr[mid]) {
    high = mid – 1;
    } else if (x > arr[mid]) {
    low = mid + 1;
    }
    }

    return result;
    }

    }

  • Mohamed Khafaga says:

    You really are a great teacher

  • shikha tewari says:

    Where were you my entire life? 😫🤦

  • Kamalesh Neerasa says:

    I want to check whether an array is super array or not.
    A SUPER ARRAY is an array which contains each element repeated its number of times.
    like 1224444 is asuper array because 2 is repeated twice and four is repeated 4 times.
    can you help me with algorithm.

  • Megha Syam says:

    what if the duplicates are not in continuous indices in the array…..for example
    A[ ]={1,2,3,4,5,6,4,7,8,4,9,4,10,4};
    Look at this now.. we have element 4 repeated but not continuously …..

  • ri wen says:

    easy to understand. thank you so much <3

  • Aleksandar Miladinović says:

    Hi. I have a question to find in unsorted list of integers 2nd anth 5th number six in Array and calculate the sum(+) of all the numbers in between those two sixes. If you could help me that would be great.Thank you

  • Bipul Kumar Sharma says:

    nice

  • Michael Salo says:

    The +1 is what would trip me in an interview. I would be thinking, I just need the difference.

  • Jakaria Ridoy says:

    Why using "firstIndex + 1" when subtracting from "lastIndex"?
    Thank You.

  • Swathi K says:

    sir can you do a lecture on prims algorithm

  • Idiots Uo says:

    Sir I'm using shipping desktop applications and import excel file in application show message
    Sorry this id already added duplicate value index was out of range. Must be non negative and less than size of the collection.
    Parameters name index

  • George Z says:

    how would you have it read from a txt file and give you the 10 most common numbers?

  • jahidul islam says:

    help me someone

  • Ahmid libya says:

    this algorithm have a major bug if there only 1 element in the array running first and last will return the same index and creates a false count there must be a case
    if (first==last)return 1;
    or when searching for the last duplicate pass the array partition from last+1 to array.length-1

  • ANCHIT BHUSHAN says:

    public static int Count(int[] a, int low, int high, int key){
    if(low>high) return 0;
    int mid = low+(high-low)/2;
    if(a[mid]==key) return 1+Count(a, low, mid-1, key)+Count(a, mid+1, high, key);
    if(a[mid]>key) return Count(a, low, mid-1, key);
    else return Count(a, mid+1, high, key);

    }

    This is the recursive way to calculate number of occurence which he was talking at @3:02, but still it has O(n) worst case complexity

  • Sami James says:

    The search for First and Last sections of this code look linear to me (O(n) and not O(Log(n)) , because you're decrementing/incrementing the boundary by 1 at each iteration

  • Sameer Giri says:

    thanks! you're better than most professors at explaining these 🙂

  • Lakshmikanth Ayyadevara says:

    #another algorithm of finding

    a=[1,1,1,1,1,1,1,1,1,1,2]

    l=len(a)

    count=1

    y=int(input('select an integer'))

    for i in range(l):

    if a[i]==y or a[l-i-1]==y:

    if a[i]==y and a[l-i-1]==y:

    print(l-1-2*i+count)

    break

    elif a[i]==y:

    if a[i]!=a[i+1]:

    print(count)

    break

    count+=1

    elif a[l-i-1]==y:

    if a[l-i-1]!=a[l-i-2]:

    print(count)

    break

    count+=1

    Sir i like your process a lot can you check this algorithm and mention which one i better one sir

  • Balasubramanian T.K says:

    if input is 4 4 4 4 and searched for 4 it will show in location 2, why?

Leave a Reply

Your email address will not be published. Required fields are marked *