Different search algorithms in data science

Swapnil Bandgar
6 min readMay 29, 2021

--

There are many useful search algorithms present which can fit different problems depending on the requirement we want to solve. They are divided in two main categories please find them as below:

There are different search algorithm which we can use when we have to identify different values in given problem or array.

· Linear Search

· Binary Search

· Jump Search

· Interpolation Search

· Exponential Search

· Sublist Search (Search a linked list in another list)

· Fibonacci Search

· The Ubiquitous Binary Search

· Recursive program to linearly search an element in a given array

· Recursive function to do substring search.

· Unbounded Binary Search Example (Find the point where a monotonically increasing function becomes positive first time)

Let’s explore the Greedy search in informed search algorithm.

Greedy is an algorithmic paradigm that builds up a solution node by node, always choosing the next node that offers the most obvious and immediate benefit. So, the problems were choosing locally optimal also leads to global solution are best fit for Greedy.

There are standard types in greedy algorithm which we can see as below:

· Activity Selection Problem

· Egyptian Fraction

· Job Sequencing Problem

· Job Sequencing Problem (Using Disjoint Set)

· Job Sequencing Problem — Loss Minimization

· Job Selection Problem — Loss Minimization Strategy

· Huffman Coding

· Efficient Huffman Coding for sorted input

· Huffman Decoding

· Water Connection Problem

· Policemen catch thieves.

· Minimum Swaps for Bracket Balancing

· Fitting Shelves Problem

· Assign Mice to Holes

Which we can use to solve standard type of problems, but when we have to use the graph to solve the problem, we refer to below greedy graph solution.

· Kruskal’s Minimum Spanning Tree

· Prim’s Minimum Spanning Tree

· Boruvka’s Minimum Spanning Tree

· Reverse delete algorithm for MST.

· Problem Solving for Minimum Spanning Trees (Kruskal’s and Prim’s)

· Dijkstra’s Shortest Path Algorithm

· Dial’s Algorithm

· Dijkstra’s Algorithm for Adjacency List Representation

· Prim’s MST for adjacency list representation

· Correctness of Greedy Algorithms

· Minimum cost to connect all cities.

· Max Flow Problem Introduction

· Number of single cycle components in an undirected graph

We will be using Dijkstra’s algorithm Shortest Path Algorithm to solve one of the examples.

Let’s consider Person A wants travel to Nashville from Memphis, find the cheapest path to reach there. there are lot of road or by air companies follow this to identify the cheapest route to reach.

Find graph below with different route and cost to reach different cities.

Dijkstra algorithm is used to solve single-source shortest-paths problem on a weighted and directed graph.

Let’s find the shortest path to reach Nashville from Memphis. We will use the table to identify and update relaxation.

Step 1) Start from Memphis, select minimum from Memphis which is highlighted below table (New Orleans), here minimum cost from Memphis is 3 which is to reach New Orleans.

If we go to Mobile from Memphis to New Orleans than to Mobile it will cost 6$ to reach. So, will update that in graph (Highlighted in yellow).

Step 2) Now select another vertex which will give minimum cost to reach next destination. As, we have solved for New Orleans now select another city which has less cost i.e., Mobile.

As going from Memphis to Savannah we are getting 12$ change which was previously unknown.

To reach Memphis to Mobile we need 6$ and to reach another 6 from Mobile so total will be 12$.

Step 3) Now let’s consider another city which has less $ cost that is Savannah.

To reach Atlanta we have below path:

1. We can directly take Memphis to Atlanta which will give us 10$

2. If we follow from Savannah to Atlanta which will give us 13$, But we already have value less to reach Atlanta 10$ so will keep it as is.

Step 4) Now we can select next city to reach next destination.

To reach Atlanta we have three paths:

1. We have already derived we have path to reach Atlanta that is Memphis to Atlanta which will cost 10$ which is less at the point.

2. If we select route Mobile to Atlanta, we might get to Atlanta in 8$ if we follow from Mobile to Atlanta. So, rewriting the parameters in table.

Step 5) Now let’s select Atlanta as vertex to solve next city problem.

As we already know to reach Nashville it is taking 15$, but if we follow route from Atlanta to Nashville it will cost 2$.

We already reached to Atlanta from Memphis with minimum cost 8$. So, we can rewrite the Nashville cost as (8+2) $ = 10$

So, we have solved problem using Greedy method’s Dijkstra algorithm which has given us minimum cost to reach Memphis to Nashville 10$. Our final table looks like below with start point is Memphis.

Complexity Analysis of Dijkstra algorithm:

Consider We have V number of vertices in a graph. Then by definition, there would be |V-1| number of edges. The main outer loop runs for |V| times. The inner loop meant where actual cost calculation happens, runs for |V-1| times for a complete graph as each vertex has |V-1| edges.

Also, for each iteration of the inner loop we do an extract Min and a decrease Key operation for the vertex.

Hence, the total running time will have an upper bound of O(|V| * |V-1|) which is equivalent to O(|V|2)

Our initialization requires a constant amount of work per node, plus we are considering all nodes time to time to identify best solution, which will give us O(N)O(N) nodes into priority Queue will take O(N*lg(N))O(N∗lg(N)) time. That’s O(N*lg(N))O(N∗lg(N)) time overall for all the initialization work.

We’ll update cost in table once per edge, or O(M)O(M) times. Each priority table update costs O(lg(N))O(lg(N)) time. That’s O(M*lg(N))O(M∗lg(N)) time overall.

Putting all the steps together, the time complexity for Dijkstra’s algorithm is O(N*lg(N) + M*lg(N))O(N∗lg(N)+M∗lg(N)). Sometimes, this complexity is written O((N + M)lg(n))O((N+M)lg(n)).

Time complexity: θ ((v+e) log v) = ((6 + 9) log6) = 15 log6 = 11.67.

--

--

Swapnil Bandgar

Code is like humor. When you have to explain it, it’s bad. Connect with me on LinkedIn : https://www.linkedin.com/in/imswapnilb