Sat, Jul 13, 2024

4 min read

Essential Data Structures and Algorithms for Efficient Problem Solving

A formal discussion on the importance of deep knowledge of select data structures and algorithms.

Introduction

In my view, it's unnecessary to know every algorithm and data structure, contrary to common belief. Instead, a deep understanding of a select few is crucial; the rest you can learn when you need them.

So, what are the most important algorithms and data structures to know? I would say that the most important members of a set $S$ are the ones essential to solving the most common problems in computer science and software engineering. Let's formalize this idea.

$$\text{Let } S = \{ \text{Data Structures and Algorithms} \}$$

You can think of $S$ as a set containing all data structures and algorithms. Now, we need to define a subset $E$ containing the essential structures and algorithms.

$$\text{Let } E = \{ \text{Essential DS and Algorithms} \}$$

For example, an essential subset of $S$ could be an $E$ containing the following structures and algorithms:

$$E = \{ \text{AVL}, \text{BST}, \text{DAG}, \text{BFS}, \text{DFS}, \text{HT} \}$$

And, to use this subset, we need to define a set $P$ containing common problems in computer science and software engineering.

$$\text{Let } P = \{ \text{Common Problems} \}$$

For each problem $p \in P$, there exists a data structure or algorithm $e \in E$ that solves $p$ efficiently.

$$\forall p \in P, \ \exists e \in E \quad \vert \quad e \text{ solves } p \text{ efficiently}$$
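To make this statement concrete, here is a toy sketch in Python. The problem names and the `ESSENTIAL_CHOICE` mapping are hypothetical choices for illustration, not a canonical catalogue:

```python
# Hypothetical mapping from common problems (P) to essential
# structures/algorithms (E); the entries are illustrative only.
ESSENTIAL_CHOICE = {
    "membership test / lookup by key": "Hash Table",           # O(1) average
    "ordered keys with guaranteed log bounds": "AVL Tree",     # O(log n) worst
    "shortest path in an unweighted graph": "BFS",             # O(V + E)
    "dependency ordering (topological sort)": "DFS on a DAG",  # O(V + E)
}

def pick_structure(problem: str) -> str:
    """For a problem p in P, return an e in E that solves it efficiently."""
    return ESSENTIAL_CHOICE[problem]

print(pick_structure("shortest path in an unweighted graph"))  # BFS
```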

Please don't place excessive importance on this notation; it serves merely as an example. Evidently, some problems require data structures and algorithms not encompassed within $E$.

Let me define what I mean by "efficiently" in this context. Efficiency is measured by the time $T$ and space $S$ needed to solve $p$ using $e$. We say that $e$ is efficient if:

$$\begin{aligned} &T(e, p) = \mathcal{O}(\log n) \text{ or } T(e, p) = \mathcal{O}(n) \\ &S(e, p) = \mathcal{O}(n) \end{aligned}$$

where $n$ denotes the input size.

Note that $\mathcal{O}(n)$ is a worst-case time bound, which is acceptable for most problems, since the average cases are better. This definition is not strict, but it gives a general idea of what I mean by "efficiently".
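To make these bounds tangible, here is a minimal Python sketch (my own example, not part of the formalism above) contrasting an $\mathcal{O}(n)$ linear scan with an $\mathcal{O}(\log n)$ binary search; both use $\mathcal{O}(n)$ space for the input list:

```python
import bisect

def linear_search(items: list[int], target: int) -> bool:
    """T(e, p) = O(n): may inspect every element in the worst case."""
    for item in items:
        if item == target:
            return True
    return False

def binary_search(sorted_items: list[int], target: int) -> bool:
    """T(e, p) = O(log n): halves the search space at each step.
    Assumes the list is already sorted."""
    i = bisect.bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target

data = list(range(1_000_000))         # S(e, p) = O(n)
assert linear_search(data, 999_999)   # up to n comparisons
assert binary_search(data, 999_999)   # about log2(n) ≈ 20 comparisons
```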

Typically, $E$ includes structures and algorithms that are efficient for common problems. For example, in the best and average cases, these structures and algorithms have a time complexity of $\mathcal{O}(\log n)$ and a space complexity of $\mathcal{O}(n)$. In the worst case, they have a time complexity of $\mathcal{O}(n)$ and a space complexity of $\mathcal{O}(n)$.

| DS / A | Insertion (avg / worst) | Searching (avg / worst) | Deletion (avg / worst) | Traversal |
| --- | --- | --- | --- | --- |
| AVL Tree | $\Theta(\log n)$ / $\mathcal{O}(\log n)$ | $\Theta(\log n)$ / $\mathcal{O}(\log n)$ | $\Theta(\log n)$ / $\mathcal{O}(\log n)$ | $\mathcal{O}(n)$ |
| BST | $\Theta(\log n)$ / $\mathcal{O}(n)$ | $\Theta(\log n)$ / $\mathcal{O}(n)$ | $\Theta(\log n)$ / $\mathcal{O}(n)$ | $\mathcal{O}(n)$ |
| DAG | $\mathcal{O}(1)$ | $\mathcal{O}(V + E)$ | $\mathcal{O}(V + E)$ | $\mathcal{O}(V + E)$ |
| BFS | - | $\mathcal{O}(V + E)$ | - | $\mathcal{O}(V + E)$ |
| DFS | - | $\mathcal{O}(V + E)$ | - | $\mathcal{O}(V + E)$ |
| Hash Table | $\mathcal{O}(1)$ / $\mathcal{O}(n)$ | $\mathcal{O}(1)$ / $\mathcal{O}(n)$ | $\mathcal{O}(1)$ / $\mathcal{O}(n)$ | - |
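As one concrete row from the table, a minimal BFS sketch in Python (the graph below is my own toy example) shows where $\mathcal{O}(V + E)$ comes from: each vertex is enqueued at most once and each edge is examined once.

```python
from collections import deque

def bfs(adj: dict[str, list[str]], source: str) -> dict[str, int]:
    """Breadth-first search over an adjacency list.
    Time O(V + E): every vertex dequeued once, every edge scanned once."""
    dist = {source: 0}                 # also marks vertices as visited
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # enqueue each vertex at most once
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# A small DAG as an adjacency list (vertex names are arbitrary).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs(graph, "a"))  # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```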

Therefore, focusing on the essential subset $E$ provides enough coverage to solve most common problems efficiently, eliminating the need to know all data structures and algorithms in $S$.
