As you know from the title, this book covers a wide range of three topics, in fact, too wide range of three topics, on which you will find books separately. You will find many good books on Discrete Mathematics, you will find many good books on Algorithm and Data Structures, as well.
Keeping that fact in mind, we are writing a book which tries to discover the relations between three well known topics, and, moreover, how they influence each other.
We can compare this relationship, such as, a relation between the writer and readers. Readers always reconstruct a book while reading, and they write it in a new way in their mind. Each time a book is read, it is actually written again.
We can consider this book as a confluence of the three distinct topics, such as Discrete Mathematics, Algorithm and Data Structures.
Let us start with Discrete Mathematics first.
What is Discrete Mathematics? We need to know that first. Second, we need to know what is the relationship of Discrete Mathematics to Computer Science. Finally, we want to know how we can implement the Discrete Mathematical concepts into various fields of Computer Science, such as Algorithm and Data Structures.
Here, we are primarily concerned with Algorithm and Data Structure. These topics sit at the core of any Computer Science curriculum. Without understanding these two components properly, we can not claim that we are students of Computer Science!
Data Structures are most fundamental and building blocks of computer science. Understanding Data Structures is an essential thing to design and build efficient software. At the same way, we can say, that knowing Algorithm is also compulsory to understand these fundamental blocks of Computer Science.
Therefore, I have chosen these two topics and try to implement Discrete Mathematical concepts into them so that our understanding might take a proper shape.
In this chapter, our first and foremost concern is to know what is Discrete Mathematics. However, before starting the book, there is an important question.
Is Discrete Mathematics enough to study Computer Science?
Our answer is – NO.
There are some other topics that we must know – now, simultaneously, or later, to get a proper understanding of Computer Science. If we love our subject, then we will try to understand how Calculus, Linear Algebra and as a whole, Mathematics is related to Computer Science.
We are not pushing ourselves to become a mathematician. That we are not going to suggest.
We want to study computer science.
Nevertheless, we also want to know all the necessary concepts that are related to computer science.
Before moving on to our real discourse, let us know what the word – Mathematics – means to us. Mathematics includes the topics, such as quantity (number theory), structure (algebra), space (geometry), and change (mathematical analysis).
As we see, as a programmer, we also deal with such topics. We deal with numbers, all types of numbers, so an introduction to number theory should be included in our study. We need to study structures (remember data structures); therefore, a knowledge of algebra is important where calculus and linear algebraic equations might play a vital role.
Therefore we need to know how to find area and volume of two moving objects, how to find the slope of a line, or need to know about linear algebraic concepts like matrices and vectors.
On the other hand, Computer Programming involves tasks like data analysis, generating algorithms, making algorithms more accurate with less resource consumption, and finally we need to implement algorithms in a chosen programming language. These processes, as a whole, are known as coding. Knowledge of specialized algorithms and formal logic may be enough to be a good programmer.
However, if we want to change the word – good – to another word – great – then we must equip ourselves with some mathematical knowledge that are not directly related to Discrete Mathematics.
Although they are not directly related, they are indirectly related. The study of calculus starts with the relationship between velocity and distance; and, they are ordered pair of sets, which is very much a discrete mathematical conceptions.
Calculus is a mathematical study of continuous change; however, computer scientific study is severely discrete, it deals with distinct objects.
Besides, linear algebraic conceptions are fundamental in many areas where line, plane and spaces are involved. We need to know them as well. If not now, then we should know them later, of course.
We should always remember that different branches of mathematical conceptions are key components in computation. We should not forget that truth as long as we think ourselves as a student of computer science.
A short Introduction to Discrete Mathematics
What is Discrete Mathematics? We need to know that first. Second, we need to know what is the relationship of Discrete Mathematics to Computer Science. Finally, we want to know how we can implement the Discrete Mathematical concepts into various fields of Computer Science.
Here, we are primarily concerned with Algorithm and Data Structure. These topics sit at the core of any Computer Science curriculum. Without understanding these two components properly, we can not claim that we are students of Computer Science!
Therefore, I have chosen these two topics and try to implement Discrete Mathematical concepts into them so that our understanding might take a proper shape.
In this chapter, our first and foremost concern is to know what is Discrete Mathematics.
What is Discrete Mathematics
Discrete Mathematics is a branch of Mathematics and, to be very particular, it is the study of Mathematical structures, and objects that are discrete.
In that sense, ‘people’, ‘animal’, ‘chair’ and everything we see around us, fall under this mathematical study. Why? Because, they are discrete, distinct. They are not continuous. Think about real numbers that consist of rational and irrational numbers, or a number line, which is continuous; so that is not discrete mathematics. Rather, natural numbers (N), like 0, 1, 2, etc, which are discrete fall under this category.
Discrete Mathematical conceptions handle with distinct and separated values. The opposite of discrete mathematics is continuous mathematics, such as calculus or Euclidean Geometry. However, Euclidean Algorithm that helps us find the greatest common divisor of two positive natural numbers (GCD), is very much discrete mathematics. We will see to an example in a minute.
Have you found any contradiction in the above statement?
I have found one; and before proceeding further, I think, we should make it clear first. We have just said that real numbers, or a number line where between 1 and 2, there are infinite numbers, cannot be discrete. However, when we say, positive natural numbers like 1, 2, 3, which are distinct and separated, belong to discrete mathematics. 1, 2 or 51, every positive natural numbers are also real numbers. Right? They belong to the number line. Right?
Then where we stand? We are contradicting ourselves.
Therefore, we can conclude that there is no exact definition of Discrete Mathematics. At best, we can say, that the opposite of continuous mathematics is discrete mathematics.
In Mathematics, there are many things that are not continuous. You may think of Set theory, x co-ordinate and y co-ordinate, that refer to different positions on the quadrants, without joining them. You may think of GCD and LCM. Greatest common divisor and least common multiple. They are always discrete.
Consider the statement in logic. True or False. They are discrete. Consider 0 and 1. The binary code, the core conceptions of Computer Science, they are very much discrete.
Consider objects in any object-oriented-programming language. Every new object is discrete.
That is why, Discrete Mathematical conceptions are closely related to every field of Computer Science.
In this book, we will only consider Algorithm and Data Structure part.
In the following part we will compute the number of molecules in a hydrocarbon. First, we will compute this program using Java, after that, we will use C++ to compute the same program.
Notice the algorithm. That is same for two languages, although syntactically there are differences. So the steps of algorithm varies. Let us see the algorithm first.
The algorithm will be like this:
Considering that the algorithm is the same, suppose for both languages, its value is 1. However, the syntactical difference makes the second value change.
If we consider that algorithm is the X axis, and the syntactical difference as Y axis, then the hypothetical value might be (1, 2) and (1, 5). Moreover, they are discrete, distinct and separated.
If we want to connect them drawing a line, that no longer remains discrete. That becomes a continuous mathematical conceptions.
With the help of these two programs, what I would like to point out is a simple mathematical statement. The difference between discrete mathematics and continuous mathematics is paper-thin. It also proves our definition of discrete mathematics. Anything, that is not continuous mathematics, is discrete mathematics.
Let us see the Java program first.
Here goes the output:
I would like to write the Avogadro’s number using the scientific notation this way:
The output changes from 6.019999999999999E23 to 6.02E23, although it should not have mattered how you have written the Avogadro’s number. In the first case,
I have used the Java pow() method. In the second case,I have used the scientific notation. However, the outputs are discrete.
I would like to show you the same program in C++, where I have used the ‘cmath’ library’s pow() function, the same way I have used pow() method in my Java code.
Let us see the program first.
The output is: 6.02e+23. Now, if we change the Avogadro’s number to this:
The output remains the same. In Java, although the value changes.
Now, we can conclude that every programming language is discrete, not only syntactically, but also in their output.
What is the relationship between Discrete Mathematics and Computer Science
We can now understand how discrete mathematics and computer science are related. Through our programming logic and algorithm, we can reproduce continuous mathematics in any programming language, like Java.
Let us see another example in Java.
First the code, then we will discuss the concept.
In the above code, with the help of line-slope and y-intercept, we find the y-coordinate. The value of x-coordinate is given.
A line-slope is the ratio of vertical rise and horizontal run of the line. The y-intercept is a given point on the y axis. If we run the program and provide the values, the output comes out as:
In the above output it is evident that every number is discrete and computing them is easier with the help of a formula. Here the value of y-coordinate is 14, when x-coordinate is 2.
What makes discrete mathematics different from the continuous mathematical proceedings is its capacity of getting counted, discrete structures are always either countable, or distinct. In the above code, we have seen that each vertical rise and horizontal run are discrete, distinct from others. Each time you run the code, providing distinct values will give you values that are separable from the previous ones.
Introducing necessary conceptions
In the following chapters, we will see plenty of examples where discrete mathematical computations will be used.
To name a few, we will discuss set theory and its implementations in data structures. We will see how permutation and combination work together in algorithm. The role of logical statement or truth table will play a great role in our discussion.
We will see how finite collections of discrete objects are implemented in the data structures. They can not only be counted, but also arranged and placed into sets. Studying Euclidean algorithm, along with finding the GCD and LCM, will be a major part in our discussion.
There are many interesting topics to come. So stay tuned and keep reading on.
Question 1: Which part of Mathematics we need to implement in Computer Science?
Option 1: The quantity (number theory) and structure (algebra) only.
Option 2: Only space (geometry)
Option 3: The change (mathematical analysis) and structure (algebra) only.
Option 4: All of the four.
Answer: Option 4.
=======================
Question 2: Which fields of algebra is important for Computer Science?
Option 1: Calculus
Option 2: Linear algebraic equations
Option 3: Both
Option 4: None of them
Answer: Option 4
=======================
Question 3: In which area of Mathematics line, plane and spaces are involved?
Option 1: Matrices and vectors
Option 2: Number theory
Option 3: Linear algebraic equations
Option 4: None of the above
Answer: Option 3
=======================
Question 4: Is a number line falls under the category of Discrete Mathematics?
Option 1: Yes
Option 2: No
Answer: Option 2
=======================
Question 5: Give example of Discrete Mathematics
Option 1: GCD and LCM, Greatest common divisor and least common multiple.
Option 2: Set theory
Option 3: None of the above
Option 4: Both
Answer: Option 4
=======================
Challenge 1 : Write down a program in any language where you can prove that discrete structures are always either countable, or distinct.
Solution to Challenge 1 : Language used Java
package fun.sanjibsinha;
//output of code
Enter the value of the line-slope in positive integer:
6
Enter the intercept of y-axis:
2
Enter the value of x-coordinate:
2
The y-coordinate is: 14. When the slope of line is: 6. Intercept of y-axis is: 2. X coordinate is: 2
Challenge 2 : Can you prove using a programming language how the Set theory is widely used in Computer Science.
Solution to Challenge 2 : Language used SQL.
In order to demonstrate and explain the set operators in SQL effectively, we will be using the following tables.
These sample tables are “customers_of_jan” and “customers_of_dec”. These tables contain 5 records each with the customer’s id, name, city.
Let’s have a look at the records in two tables. So that later, we can understand how set operations are helpful.
If we use the following code, it will create problems.
Because, there are two John and two Mary combining both tables. As a rule of thumb, the above SQL atatement will choose only one John and one Mary.
However, implementing the concept of SET Theory in Computer Science, we can use UNION in the following way.
Union ALL operator will take two John and two Mary into consideration and returned John twice. The same is the case for Mary.
Conclusion: The above challenge and solution proves that without the Set Theory, which belongs to the Discrete Mathematics, we could not build any kind of Relational Database.
2. Introduction to Programming Language and Boolean Algebra
Programming language is a formal language that contains set of instructions, which produce various kind of output. It can either be imperative or declarative.
Imperative means it has sequence of operations to perform. On the other hand, declarative stands for the specification of desired results, it does not comprise of how to achieve it.
There are thousands of programming languages, many are being created everyday, probably! However, most of them are imperative. They comprise of certain commands or instructions for computers to perform certain tasks, producing various outputs.
Let us try to understand this part very clearly.
In natural language, such as in English, we have encountered imperative mood where a verb is used to form a command or request. Consider this example: “do it”, or “Leave” and many more. Mostly it is directed towards a second person, but in some certain contexts, it can involve first ot third persons, such as, “Let’s do it”.
The same way, an imperative programming also comprises of certain commands, set of instructions that focus on how a program should operate. Consider a simple procedural imperative programming example in C programming language.
In the above code, we have seen the function (method) declaration that gives us a standard output when the function is invoked. In the main() function area, the compiled code will run the code as we invoke the function doSomething(). It gives us the following output.
Quite simple and straightforward.
The above code snippet gives us an idea of how procedural programming works. It is a part of imperative programming paradigm. It is called procedural, because, the program is built on one or more procedures or functions. It is sometimes also called subroutines.
Now question is, where does discrete mathematical conceptions stand here? What is the relationship between a procedural programming and discrete mathematics?
The simple answer is, a program should build either successfully or failed miserably. It cannot go on continuously. So a program must be either true or false, 1 or 0, to be more precise.
It should be discrete. We need to compile the code successfully, that is discrete. We give the instructions, that should be discrete. And, finally, we need the output, that should also be discrete.
Now, imperative programming does not end with only procedural programming. There is object-oriented-programming paradigms. A C++ program could be a good example.
Let us see the code first.
The above guy does the same thing, giving us the same type of output, only in a different manner. This C++ guy does the same thing discretely. In a distinct way. We cannot let it run continuously.
I hope, now it is clear to you why we need to understand discrete mathematical conceptions; it is needed to understand our code in a better way. Discrete mathematics is a part of mathematical conceptions that are not continuous. We cannot let our code run continuously. So we need to understand discrete mathematical conceptions.
I am not going to elaborate this section anymore. There are several other types of imperative programming. There are several other types of declarative programming languages. This is not the place to discuss them.
It is only imperative to know that our code cannot run continuously. Either it should build successfully or fail miserably; whatever the outcome, it should be discrete.
Logic, Mathematics, and Programming Language
Does logic come before mathematics? As long as humans are concerned, the answer is “Yes”. A child without knowing any mathematical conceptions, knows that what is hot and what is cold. She develops her logical experience by observation and actions. Is it hot or cold? The answer is yes or no. Accordingly, her actions take place.
Latter in life, she applies her sense of logic in understanding mathematical conceptions. Logic comes before mathematics and after that comes algorithm or steps; finally we write code based on that algorithm.
In programming language, logic plays the greatest role; because it leads to the acceptance of one proposition, the conclusion, on the basis of other propositions or premises. Moreover, the conclusion is discrete. That is why, in programming language, the truth table plays an important role. In programming language, if, if-else, or else conditionals lead us to one inference.
In ordinary life we also apply the same logic by observation.
Let us see a very simple program in Java, to see how this logical inference matters.
We have entered three different number and got the following output:
The above Java code, is based on a few principles that we can call ‘truth table’, based on which we build our algorithm. These logical steps are universal. It is not only true for the Java guy, but also applicable to every single programming paradigm.
What is that?
There are three logical operators; they are ‘&&’ symbol that stands for ‘and’; ‘||’ symbol that stands for ‘or’; finally, we have a ‘!’ symbol that just converts true into false and vice versa.
In case of ‘&&’ symbol, if both statements are true, it comes out as TRUE. For the ‘||’ symbol, if any one of the statement is true, it comes out as TRUE. And you have already known about the nature of ‘!’ symbol.
These logical operations are dependent on conditional operators; they also have different symbols, such as ‘==’, ‘!=’, ‘<’, ‘>’, ‘>=’, and ‘<=’; when two conditions are ‘==’, (equal), we perform some operations and so on.
We can conclude that, logic, mathematics, algorithm and code are inter-dependent; and, they should be discrete, as well.
Introduction to Boolean Algebra
George Boole, the founder of Boolean algebra, is considered to be one of the founder of Computer Science also. In 1847, when he wrote his famous book “The Mathematical Analysis of Logic”, he had not thought about PC, mobile or tabs. I don’t want to say that. But in his second book, “An Investigation of the Laws of Thought” that he wrote in 1854, he clearly set the path for future computations.
His great idea started a new branch of algebra, Boolean algebra, where the values of the variables are the truth values: true or false. They are usually denoted by 1 and 0 respectively. In elementary algebra the values of variables are numbers. There are several operations we can do on that like addition or multiplication.
In Boolean algebra, there are only three operations we can do: conjunction (and), disjunction (or) and negation (not). Now we are able to the logical operations as we use to do the numerical operations in elementary algebra.
Moreover, it helps programmers to create a formal description of logical operations with the help of conditionals like “if, else if and else”, and in some cases using “switch-cases”.
Now it adds great impetus to every modern programming language.
Not only that, with the help of this ‘truth table’, we can build a very complex decision trees.
Let us start with a simple example.
Let us see the code first.
Watch the output:
Globally we have made the boolean variable ‘isTrue’ true. Therefore, the first ‘if’ block allows us to enter into the block. Next, we have made the ‘isTrue’ false.
Now according to the boolean algebra and truth table, ‘false and false’ is ‘false’; and, ‘false or false’ is also ‘false.’ For that reason, we get the above output.
Now, we would like to change this code a little bit.
Watch the output now:
Now, in the second conditional, ‘true and true’ is ‘true’. So, we get the above output.
Let us see more examples and watch the output one after another to comprehend how this truth table works.
It’s quite obvious that very first conditional is true, so we enter the block. However, the next conditional comes out as false, because in the truth table ‘true and false’ will come out as false. Therefore, the code will check the next conditional, which is true.
So the output will be as the following:
As the above code snippets, we can apply the same truth table on other data types using the logical operators. Here is an example:
Let us give different types of ‘age’ to check how our code works.
Let us make this example a little bit complex, so we can have an idea about how complicated this combinations might become.
Let us give some different types of input as ‘true’ or ‘false’ and see how our code responds.
Like to make the combinations more complex? Well, we can try the following code snippets.
As usual, we will give different types of age to test how this combination works by maintaining the truth table rules.
The above code establishes one thing, by implementing the proper usage of the truth table, we can stop the middle conditional to override the basic rule that has been defined earlier. The above code snippets check the entry to some places. In between there is a boolean value called ‘isAllowed’; you may think this guy as the gatekeeper who can override the entry with a special power. In fact that happens in the real life.
However, through the proper usage of the truth table we have limited his power to override the main conditional that says, for the age range of less than equal to 10 and greater than equal to 70, the entrance fee is zero.
Now if the gatekeeper wants to take price from that age range, he cannot do that.
As we progress, we will see more examples of boolean algebra in the future course of our book. So stay tuned and keep reading.
Question One: In Boolean algebra, how many operations we can do?
Option 1: One
Option 2: Two
Option 3: Three
Answer: Option Three. (In Boolean algebra, there are only three operations we can do: conjunction (and), disjunction (or) and negation (not).)
Question Two: Will ‘true and false’ come out as false in the truth table?
Option 1: Yes.
Option 2: No.
Answer: Option 1.
Question 3: What is the imperative programming language?
Option 1: The programming langauge that deals with Discrete Mathematics.
Option 2: The programming language that comprises of certain commands or instructions for computers to perform certain tasks, producing various outputs.
Option 3: None of the above is true.
Option 4: Both statements are true
Answer: Option 4
=======================
Question 4: What is the relationship between a procedural programming and discrete mathematics?
Option 1: The common relationship is that they both deal with Boolean Algebra.
Option 2: A program should build either successfully or failed miserably.
Option 3: Both cannot go on continuously.
Option 4: All of the above.
Answer: Option 4
=======================
Question 5: A child develops her logical experience by observation and actions. Is that true for Mathematics?
Option 1: Yes
Option 2: No
Answer: Option 1
=======================
Challenge 1: Can you write a program based on the <<Truth Table>>
Solution to Challenge 1: Language used Java
Challenge 2: Write an algorithm in natural language that builds a very complex decision trees.
Solution to Challenge 2: Language used Java
Challenge 3: In Boolean algebra, there are only three operations we can do: conjunction (and), disjunction (or) and negation (not). Can you prove this paradigm in a complex “Truth Table” that uses conditional logic.
Solution to Challenge 2: Language used Java
Challenge 3: Can you write a program in any langauge that will charge money if the user’s age is between 10 and 70.
Solution to Challenge 2: Language used Java
3. De Morgan’s Laws on Boolean Algebra, Logical Expression, and Algorithm
In this chapter we will learn about basic algorithm, which has its roots in De Morgan’s laws on Boolean algebra, and logical expression. After learning about basic algorithmic steps and sequences, we will discuss data structures in the next chapter.
To build complex algorithm, we need to understand the core concepts about data structures (chapter 4); we will come back to more advanced concepts of algorithm again in chapter five.
Let us start this chapter with Boolean algebra.
Augustus De Morgan was a contemporary mathematician of George Boole. Although he did not create the laws using his name, yet it is credited to him, since he was the creator.
De Morgan’s laws are based on Boolean algebra, and in every programming language, it is widely applied and equally true.
What the rule states, we can write this way, where ‘a’ and ‘b’ are two boolean values (true or false):
Let us apply this laws in PHP. We have stored the first law in ‘DeMorganOne.php’ file. Let us see the code first:
We have tested the first law by passing the same value through two class variables and methods; we have obtained the same result.
Now, you can play around by passing different types of value to see how this law works. Whatever the values you pass, they must be same for two member methods and you will get the same result.
To test the second law, we have created another PHP file ‘DeMorganTwo.php’, where we have done the same thing, except that the logical expressions have been changed.
We have tested the second law by passing the same value through the class variables and methods. Watch the output, it gives us the same value for two different methods.
In Java, or C++, you can apply the same logic to test that the laws work. Consider the following Java file where we can comment out the entire process, because we need to test the same code separately.
Inside the commented out sections we have kept the code and output together. In Java, you need to test each law separately. In PHP, we could have used a form inputs to build a web application where we can pass two values to see the result dynamically.
Logical Expression
We can create compound expression by combining logical operations. De Morgan’s laws are based on this paradigm. Consider the following expression:
Whether the above compound expression is ‘true’ or ‘false’,depends on different types of combinations. If ‘a’ and ‘b’ are both false, the negation of sub-expression (a or b) is true. If any one of them is ‘true’, then the value will be ‘false’, and this combination may take different shapes according to the ‘truth table’, which we have seen before.
A major part of Discrete Mathematical operations is based on Boolean Algebra and the associated logical expressions. Just to recapitulate, we need to remember that there are three logical operators; they are ‘&&’ (and), ‘||’ (or), and ‘!’ (negation). The ‘truth table’ is based on them.
Logical operators manipulate the logical values. The same way, ‘relational operators’ also manipulate the logical values.
There are two kinds of relational operators: equality and ordering.
The two equality operators are: ‘==’ and ‘!=’. The ‘==’ operation is true when the two operands have the same value. The same way, ‘!=’ operation is true when two operands have different values.
The ordering operators test the relative size of two values. They are: ‘<’ (less than), ‘>’ (greater than), ‘>=’ (greater than or equal to) and ‘<=’ (less than or equal to).
Short Circuit Evaluation
Before all the operands have been considered in the evaluation of any logical expression, we sometimes know the value of the expression. Consider a situation, where two operands are using ‘and’ operation. If any one of the operands is known to be false, we instantly know that the result is ‘false’. On the other hand, when two operands use the ‘or’ operation and any one of the operands is known to be true, then we know that the value of the expression is true.
A programming language requires that the left operand (the first one) be evaluated before the right (the second one) operand. If the value of the logical expression is determined from the first operand, the second operand is not evaluated.
This type of evaluation is known as short circuit evaluation and bot ‘and’ and ‘or’ operations use this kind of special evaluation. To make long story short, the second condition is not checked, depending on what type of operations take place, whether you are using ‘and’ or ‘or’; moreover, what is the value of the first condition.
We will check the both cases, using Python. We are going to use Python 3.6.
Read the comment section. Besides, we can get the idea from the output:
Syntax, Semantics and Conditional Execution
So far we have seen many usages of ‘if’ statement. In a program, when the ‘if’ statement is reached, it first checks whether the operation is true or not. If it is true,it is executed, the code between the ‘if block’ is acted upon. Otherwise, if the ‘action’ is not acted upon inside the ‘if block’, program execution continues with the next statement in the program.
The description of the execution part of any ‘if’ statement, is called ‘semantic’ definition.
Syntax and Semantics
We need to take a very quick look at these two guys – syntax and semantics. They are very essential in every programming language.
Here the rule of natural language follows. Syntax describes the rules by which the words can be combined into sentences. On the other hand, semantics describes what they mean.
Consider a simple example.
In the above sentence, the syntax and semantics are both flawless. There is no syntactical error and the semantic definition is meaningful.
However, what about the next sentence?
It is also syntactically correct. There is no syntax error. But, is that sentence meaningful? Semantics describes what the sentence means, and it means nothing. We neither give name to our chairs, nor we introduce them like this.
In a programming language, syntactical rules are important. We should not miss a semicolon after an expression in many languages like C++, Java, PHP, etc. But we should not use semicolon in Python, in the same situation. That is syntax. We should maintain those rules.
We cannot use the keywords or reserved words as variable or function name. That part is OK. But what about the semantics?
That is equally important. If our logical expression is wrong, the program is not meaningful anymore, it takes inputs and gives us erratic output.
In the next two programs, we will see how this syntax and semantics work together in two different programming languages.
We have used Python to create a base of calculation using the ‘if-else’ logic.
We have only used one option to test that the program runs fine, when the denominator is zero.
In the next program, we have used the same logic for base of calculation, in a slight different way, in C++, using the ‘switch-case’ statement. Compare the syntax between these two programs, semantically they are equal, rather meaningful.
We have tested the code in various ways, to find out the semantics is meaningful and the code runs in every possible situation.
From previous code snippets we have learned two important lessons. A program should be syntactically correct, as well as semantically correct. If we write a same program in two different languages, their syntax may be different but semantics is same. There are also two types of semantics – one is known as ‘static semantics’ and another known simply as ‘semantics’.
By the term static semantics, we mean program runs well, gives us no errors, but at the end of the day it is not meaningful. It gives us outputs that were not intended while we wrote the code.
Full semantics, on the other hand, may run the loop for ever or simple crash the program, while we try to run it.
In the next program, we are going to sort three numbers in ascending order. Here semantics plays a very vital role.
Why? We will see in a minute.
Syntactically and semantically this program is clean and it reflects in the output:
In the above program, you can change the static semantics just by changing the positions of the variables; in that case, your code snippets is syntactically correct, and it is built successfully and runs correctly without crashing the program. However, the output will be erratic, because the static semantics is incorrect.
The role of semantics as a whole becomes increasingly important as the logical expressions get complicated. Not only that we always write code with the help of ‘if-else’ or ‘switch-case’; we need control constructs, different types of looping, we need to write complex algorithm, etc.
While write our program that way, we need to keep one thing in mind, what we write should make sense, it should be meaningful. The concept of semantics need to be understood for that reason.
Why we need Control Constructs
We need it because computational thinking gives us enough power to write sequence of steps or recipe for doing any repetitive job. Suppose we need to find out average of a finite set of different numbers. We might imagine doing this for 5, 6 or 10. However, when the list grows and reaches 100000, it becomes impossible.
We need to find out some solution to do that. Suppose our application is programmed to take 100000 inputs from a file where the numbers are stored. Can we enter them manually and see what would be the output?
Consider a program like the following one:
For 6 numbers it is OK. The output gives us the proper value.
However, this type of operations is better handled by iteration using the ‘while’ statement. What we have seen in the above code makes us believe that we need an action that should be repeatedly executed. We add two numbers and get a total. Next, we add the third number with the running total. It will go on as long as the number of values processed is less than the finite set of numbers that we want to add and then divide by that number to find the average.
Now we need to map that problem to our program domain with the help of ‘while’ statement. Because the ‘while’ statement deals execution of any repetitive action better than any other statement, we can write it using ‘natural language’ this way:
Now the time has come to map this problem on our program domain, this way:
And here goes the same output that we have seen in the previous code (3.8).
In this section, we have learned many important concepts. You have probably noticed that we are handling with discrete numbers. We are also talking about a finite set of numbers. It is an integral part of discrete mathematical operations.
In discrete mathematics, we almost always quantify. We always check the ‘existential’ logical expression like ‘if there is’ or ‘if there exists’, etc. Moreover, we also check for ‘global’ values that is meant ‘for all’. For all the numbers inside the finite set of numbers, we are adding them one after another; that leads us to a grand total. We also count how many numbers are there and divide the grand total by the total numbers present inside the finite set.
As we progress, we will see how these concepts come handy for the functions. How set theory is pertinent for collections or data structure, etc. There are a lot of things to cover and we are afraid that we won’t be able to cover everything. However, we will try our best to learn a few things, so that in future we can take that knowledge forward.
And by the way, we have also learned what algorithm means actually! To make the long story short, it is a sequence of instructions to solve a problem.
In the next section, we will cut into the subject and turn over the topic to learn more!
Discrete Mathematical Notations and Algorithm
One of the main branches of computer science is algorithm. One of the main branches of discrete mathematics is also algorithm. That is why we use concepts and notations from discrete mathematics in computational algorithm. We can map any problem from the mathematical domain to program domain with the help of same algorithm.
When you combine mathematics and computation, algorithm means a well-defined instructions that are computer-implementable. Yet, mathematics is a separate domain, when we try to map one mathematical problem into computational domain, we need a sequence of instructions that should be unequivocal, which means the algorithm should exhibit a single clearly defined meaning. A distinct meaningful output should come out from the inputs.
Now, we have learned, what algorithm is, however, we must know why we need it.
The question is why we needed algorithm four thousand five hundred years ago in Babylon? Why we needed it three thousand five hundred years ago in Egypt, or later in Greece?
The answer is: to decide something. When we travel by one car and come to a road-divider that indicates two ways, we cannot go to two ways simultaneously. Our decision should be discrete. 1 or 0. True or false.
In contrast, if we have two or more cars, the decision might be something else.
Although ancient Babylonian, Egyptian or Greek mathematicians started using the concepts of algorithm since antiquity, the very term ‘algorithm’ is derived from the name of the ninth century Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī. Much before that, Greek mathematicians used sieve of Eratosthenes to find prime numbers; they also used Euclidean algorithm to find greatest common divisors (GCD).
Decision making very heavily depends on effective calculation. In the last century, many renowned mathematicians worked on that and still it goes on. High level programming languages have to come to terms with that. They have to do that, because as time passes by, the size of data has increased, and it will increase with the passage of the time.
We have enough theoretical discussion, let us plunge into code to understand how we can map our problems from mathematical domain to our computational domain.
Let us start with prime numbers. In ancient time, Greek mathematicians used sieve of Eratosthenes to find prime numbers. We have plenty of other solutions at our hand now. Still, we need to know what does a prime number actually mean.
A prime number is a natural number that has exactly two discrete natural divisors. Consider this example: 2 is a prime number, because there are exactly two divisors: 1 and 2. The same way, 11 is a prime number, because there are exactly two divisors: 1 and 11.
Based on that concept, we can write our algorithm in natural language, this way:
Let us tale that algorithm to our computational domain using Java programming language.
We can test the program by giving two inputs like 49 and 47.
Here, in the above code, the algorithm is one of the simplest. We have counted the number of factors of any number and test the condition, whether that number crosses 2 or not.
Let us solve the same problem with the help of a different algorithm.
First, we see the code, then we will discuss the algorithm used in it.
In the above code, we have used a boolean method that uses one parameter; now we can pass any number to test whether that number is prime or not.
We have used the trial and division method to find out whether the output is true or false.
The sieve of Eratosthenes algorithm works on a different type of algorithm. Let us first write the algorithm in natural language. By the way, we should remember that the sieve of Eratosthenes algorithm is used to find out prime numbers in a range of numbers, such as we can test how many primes are there between 2 and 30. Usually the end number is denoted by ‘n’; for the sake of simplicity, we consider an integer. The algorithm goes like the following:
Let us build our program based on this algorithm. This time we have used python 3.6, to get the result. Each step is mentioned inside the comments, we have used in this case.
Here is the output of the above code where we have passed 20, to get all the primes below 20.
In this algorithm some numbers are marked more than once, like 8, for 2 and 4, both. The main idea is when the number is composite, that is, when it is a multiple of some prime numbers, it is marked.
Since, every even integers are marked out we list odd numbers only (3, 5, …, n), and count in increments of (2 * startingNUmber), thus marking out only odd multiples of ‘startingNUmber’.
After that, multiple of primes becomes composite, having more than two factors, making them composite.
The same way, thousand years ago Greek mathematicians used Euclidean algorithm to find the greatest common divisors of two numbers. Originally it was subtraction based, later the same algorithm had been written numerous way, re-modeling the original one.
We will see those versions later when we will discuss Euclidean algorithm in detail. However, we can take a quick look at the original Euclidean algorithm in a Java program, as shown in the following code snippets.
We can take any two numbers and see how this algorithm works.
The Euclid’s Algorithm is one of the oldest algorithms that is still relevant to, not only discrete mathematical conceptions, but also in the computational world of 1 and 0.
In the above program, we have seen that the effectiveness of the algorithm has been proved by the correct output from given inputs.
Now, many things depend on algorithm. Like hardware, algorithm is also considered to be technology for one reason. Every algorithm has its own time-complexity. When an algorithm takes higher time to produce an intended result, it is considered to be non-optimal. On the contrary, less the time-complexity, higher is the desirability.
Therefore, these two parts are very critical while we consider an algorithm. How much time it takes to perform the algorithm, is a big issue. On the other hand, adaptability of the algorithm to computers is another big issue.
As far as Euclidean algorithm is concerned, we can make this algorithm faster by making it recursive based. We can also write the same program in different ways, using Python 3.6 in this case.
The output is quite expected:
We always face a trade-off between elegance and speed. Some computer scientists feel, the smallest possible program for producing the output is the most ‘elegant’. The most optimal algorithm is the most desirable. Speed matters.
One criterion is definitely the time taken to perform the algorithm, another is its simplicity, which is also termed as elegance.
While we write any kind of program, we need to keep those things in mind.
We have learned that the time taken for running an algorithm is important. It is measured by the ‘Time Complexity’. Our final goal is to improve the performance of any algorithm. To do that, we need to count the number of elementary operations performed by the algorithm. ‘Time Complexity’ does the same thing.
We will cover these concepts in great detail in chapter 5, after understanding data structures. Understanding data structures is needed for one reason: for building complex algorithm, we use various types of data structures.
To understand basic algorithm, we have used array, a basic collection of a definite set of elements. Before concluding this chapter, we will take a look at three code snippets, where we have sorted a definite set of discrete integers and arrange them in ascending order.
To make this operation successful, we have used Quicksort algorithm.
The Quicksort algorithm is a popular sorting algorithm that is often used not only for sorting numbers, but also objects from any custom class. Furthermore, there are many other sorting algorithm, which are complex, and we will learn them in the coming chapters.
Before we present the first code snippet, using Python 3.6,we need to know one more thing about Quicksort algorithm. It is of an average ‘Time Complexity’ and it is represented by Big O notation, as O(nlogn).
For the beginners, it may appear intimidating at first, yet it is not, when you understand the principles behind this Asymptotic notations. Again, to remind you, we will cover this in great detail in chapter 5.
Let us see the first Quicksort algorithm example.
Let us first see the output first, after that we will discuss the Quicksort algorithm.
In the above code snippets, if you go through the comments,you will get the idea. We need a collection that has a definite set of elements, here integers. We can think it as ‘input’ or ‘n’, which is a variable. Now as the value of ‘n’ increases, the ‘Time Complexity’ varies.
In the above algorithm, we keep testing the value of the integer with respect to the search index number, and keep the larger number on the right side. It arranges the collection in an ascending order.
Now, almost same thing we are going to do using C language. In the first case, we will have a definite set of integers, just like above. In the second case, we will take input from the users and apply the Quicksort algorithm.
Let us see the first code snippet that handles definite set of integers.
Again, we have written our sequence of steps in the comments. Just go through it and you will get the idea. We have solved the same problem, only in a different way.
The next Quicksort algorithm will take inputs from the users. However, there is a limit that we can control. Crossing that limit will give you an error, although a simple ‘if-else’ statement will solve the problem for us.
The four different output shows how we have changed the program to fit the Quicksort algorithm.
First we will take a look at the output, that will explain why we need to change this code a little bit to get it done perfectly.
Here in the above code, we have set the limit of inputs to 6. Initially we have followed the rule and entered 6 integer value and we have got the perfect ascending order.
However, what happens, if someone breaks that limit and decides to enter 8 elements?
Here is the output:
We have entered 8 elements and it breaks the code. The code is not working anymore.
To avoid such incidents, we need to change the algorithm a little bit.
Now, we are ready to face the worst case, where users will cross the limit of inputs and get the warning.
Now, once the user crosses the limit, the warning gets displayed on the screen.
If the user follows the rule, it works perfectly again.
In this chapter, we won’t go any further about algorithm. In the next chapter,we will learn about data structures.
After that, in the following chapter, we will follow up complex types of algorithm using data structures.
Question 1: In De Morgan’s laws this statement is true: not (a and b) is the same as (not a) or (not b)
Option 1: No.
Option 2: Yes.
Answer: Option 2
===================
Question 2: A programming language always checks that the left operand (the first one) be evaluated before the right (the second one) operand.
Option 1: True.
Option 2: False.
Answer: Option 1
===================
Question 3: How many logical operators are there?
Option 1: Two
Option 2: Three
Option 3: Four
Option 4: None of the above is true
Answer: Option 2
=======================
Question 4: What is known as short circuit evaluation?
Option 1: A programming language always checks that the left operand (the first one) be evaluated before the right (the second one) operand.
Option 2: Both ‘and’, ‘or’ operations use this kind of special evaluation.
Option 3: the second condition is not checked, depending on what type of operations take place.
Option 4: All of the above.
Answer: Option 4
=======================
Question 5: Find a flaw in this sentence - “Here is my chair, Emilia.”
Option 1: It is syntactically wrong, but semantically correct.
Option 2: It is semantically wrong, but syntactically correct.
Option 3: It is semantically wrong, and syntactically wrong.
Option 4: None of the above is true.
Answer: Option 2
=======================
Challenge 1: De Morgan’s Laws on Boolean Algebra states the following rule:
One of the rule states, we can write this way, where ‘a’ and ‘b’ are two boolean values (true or false):
Based on this pribciple can you write any program in any programming language?Remember, in every programming language, the steps or algorithm will be the same.
Language used: PHP 8
Solution to Challenge 1:
// output
Challenge 2: Can you write a program where the semantics (meaningful program) plays a very vital role.
Clue: If we write a same program in two different languages, their syntax may be different but semantics is same.
Language used: Python 3.10.0
Solution to Challenge 2:
// output
Challenge 3: The ‘while’ statement deals execution of any repetitive action better than any other statement. Can you write it using ‘natural language’ first? After that translate that into any programming language.
Language used: Python 3.10.0
Solution to Challenge 3:
The steps in Natural Language first:
// output
Challenge 4: Greek mathematicians used Sieve of Eratosthenes to find prime numbers. Can you translate Sieve of Eratosthenes in any programming language where you need to prove that a prime number has exact two divisors. (An example: 11 has two divisors: 1 and 11.)
Language used: Java
First Solution to Challenge 4:
Steps or Algorithm in Natural Language:
The program:
// code
// output
===========================
Second solution to Challenge 4
Language used: Python 3.10.0
The algorithm in Natural Language goes like the following:
The program:
// code
// output: all the prime numbers below 20.
Challenge 5: As the value of ‘input’ or ‘n’ increases, the ‘Time Complexity’ varies. Prove it using any programming language.
Solution to Challenge 5:
Language used: C
// code
First we will take a look at the output, that will explain why we need to change this code a little bit to get it done perfectly.
As the ‘inputs’ increase, it breaks the code as the time complexity varies. We cannot enter more than 6 values.
However, we can solve this problem the follwoing way.
Language used: C
// code
// output
4. Data Structures in different Programming languages
The core concepts of data structures in different programming languages are almost same, although the implementation varies syntactically.
Moreover, algorithm or sequence of instructions is deeply associated with the core concepts of data structures. We will see to that in a minute. Before that we need to know a few elementary discrete mathematical algebraic concepts, which are also associated with data structures.
Before cutting into the core concepts of data structures, theoretically we need to understand one key concept. The concepts of data structures in various programming language inherit its roots largely from discrete mathematical algebraic conceptions that are known as date set.
What is data set? A collection of integers.
We can write it like this:
We started storing data and started doing operations on them much before any programming languages’ forays into data structures. In that sense, we have implemented those old concepts of algebraic data set quite recently into programming languages.
We are, rather forced to do that. The volume of data is increasing faster than ever. Effective means of algorithm to sort those big data is getting more attention than ever.
In programming languages, we had thought of ‘array’ first; then, we found ‘array’ was not enough. Therefore, we implemented conceptions like Stack, Queue, Linked List, Trees, Hashing, etc. Data structures are now used in in every programming language to to store data in an organized and efficient manner. And to do that, we need efficient algorithm. That is the basic concept. Moreover, discrete mathematical algebraic data set, algorithm and data structures are twisted and entwined into a meaningful entity.
Let us clear some algebraic concepts about data set. After that, slowly we will enter into the world of complex data structures.
Stay tuned and read on.
Mean, Median and Mode
You have probably noticed that in discrete mathematics, there are always two conceptions go together. One is ‘order of operation’, and the other is ‘distributive property’.
Consider some factors like this:
Now, we can rearrange the same factors like this:
Therefore, order of operations and distributive property works pretty well with addition and subtraction. However, this will not work with multiplication and division. Just try it.
Any algebraic operation is nothing but a kind of algorithm. In the above case, our algorithm fails for two separate cases. Our algorithm works very well with addition and subtraction; but, it does not work with multiplication and division.
The same thing might happen for a data set. In a collection of numbers, we might try to apply the same algorithm. Consider a data set like this:
A data set inside a data set. We cannot apply our algorithm here as well. We need to use of ‘order of operation’ rule to get the addition done inside the sub-data set first.
Now, we understand one thing, what we can do with discrete integers, we cannot always do with a collection of integers. We need to invent some different algorithm.
In any algebraic data set, there is always an average of the collection. Consider this data set:
The average of this data set is : (6 + 2 + 3 + 8 + 1) / 5, or 20.
This is also called ‘Mean’ of a data set. In mathematics there is other ‘mean’ as well. The geometric mean, we may have heard of that.
The word ‘mean’ means many things in our natural language also; but, in algebraic data set, it means the average. No lexical disambiguation, please.
On the other hand, in any algebraic data set, the Median represents the middle value of the data set. In any collection of integers, which is odd in numbers, finding the Median is quite easy.
In the above data set, 3 is the Median or middle value. But it is not true when the number of integers in a collection is even.
Consider the following data set:
Finding the Mean is quite easy. It is not related to the odd or even numbers of the collection. But, in the above data set, finding the Median is not that easy. To find the Median we need to find the middle value of 2 and 3; because they are in the middle.
The middle value of 2 and 3 is 2.5. Therefore, that is the Median of the above data set.
Another important thing of a data set is the Mode. In some data sets, one value or more than one values most often appear. Consider this data set:
In the above data set, there are more than one values that repeatedly appear. Right? We can see that 1 and 56 repeatedly appear two times, 7 appears three times; moreover, the integer 3 repeats five times. Therefore, the most often appeared integer is 3. It is the Mode of the data set.
Reading so far, we may ask ourselves, why we need to know this before studying data structures?
Well, there is an answer.
The sum of a collection of numbers divided by the count of numbers in the collection means the arithmetic mean; it is true for mathematics and statistics, as well. So the ‘Mean’ of a data set is often referenced as ‘arithmetic mean’ for lexical disambiguation. As we have learned before there are ‘geometric mean’ or ‘harmonic mean’.
As far as programming language is concerned, sometimes we need to find out the arithmetic average income of a nation’s population, which is known as per capita income.
Now, we can logically guess that by using ‘Mean’ of a data set, we cannot represent the central tendencies of a data set. Consider a data set, where a people’s income is much, much greater than most people’s income. Theoretically, the nation’s per capita income shows a very good central tendencies, which is a false statement; in reality, in that country, 70 percent of people live under poverty line.
In such situation, the ‘Median’ may be a better description of central tendencies.
Why? Because, the ‘Median’ separates the higher half from the lower half of a data sample. For a data set, we have seen that it represents the middle value.
Whereas the ‘Mean’ can be skewed and twisted to give a false representation of central tendencies, here is the basic advantage of the ‘Mean’. It may give a better idea of a typical central tendencies. It cannot be skewed using a small portions of extremely large values, compared to a large portions of extremely small values. If only more than half the data are represented by false, and extremely large values, the ‘Median’ will give an arbitrarily large or small result.
Array, the First Step to Data Structure
We know that an array is a sequential collection of elements. These elements are of same type. They are stored in memory sequentially. An element of an array can be obtained through its respective index. An array element in Java is treated as an object, that is not true in C.
As a definition, we can say that an array is a data structure that holds a similar type of elements. On the other hand, it can represent a large collection of algebraic data set. We can generate a random set of elements in an array like this:
The above code is a sample of code snippets that can generate a random data set of big integers. Here we have produced only five specimens.
We have an idea of how we can manipulate data in a big way. It is just a very small sample.
We can search any value in the random output, like this:
We are trying to find whether 121 belongs to the randomly generated array of integers. The answer will come out as ‘false’. You have probably noticed that we have used this line in our code while generating the random integers.
Because ‘Math.random()’ method produces ‘double’ data type,we have to cast it to a round figure by using ‘int’ data type. Each time you run the code, you will get the same output. If you change the value from 121 to a higher value,the output will be higher.
Therefore, the output is quite expected.
The element 121 does not belong to the randomly generated collection of integers.
As an object-oriented-programming language, Java treats this type of collection differently than C. In Java an array is a container object that holds a fixed number of values of a single type. Whenever we create an array, the length of the array is established. After creation, the length of an array is fixed.
In Java, whenever we want to create an array object, we write something like this:
However, in C, this is different. According to some computer scientists, C stands between problem oriented high level languages like Fortran, Basic and Pascal, and the low level machine languages like Assembly language or Machine language.
Any C programs consists of one or more distinct units called ‘functions’.
In C, like in the above code, we declare array, this way:
Whatever be the language type, function based or object-oriented-programming based, the accessing of an array element is the same. This is done with ‘subscript’; all the array elements are numbered, starting with 0. In C,what is known as subscript, in Java the same is known as numerical index. In Java each item in an array is called an element. In C it is also known as ‘dimension’.
The Mean, Median and Mode of algebraic data set is widely used while we manipulate any array. Consider the following example where we have calculated the average marks of 10 students. The user is asked to give inputs and the program calculates the Mean of the data set.
Let us see the output of the above code.
In the above code, this line is important:
We have expected that the output would be a floating point value. In every language, we use all types of primitive data types to declare the type of the array.
In Java, we can declare arrays of many types, this way:
We can use the shortcut syntax to create and initialize an array, almost the same way.
In Java, we use the shortcut syntax this way:
In C, we can use the shortcut syntax almost the same way. Sometimes, if the array is initialized where it is declared, we need not mention the dimension.
Whatever language we use, the main advantage of array is it helps us to save the memory of the system. We can allocate memory dynamically; when the memory allocation is not dynamic, it stores the data in contiguous memory locations. This process makes the program faster than other data structures.
The idea of algebraic data set and the conceptions regarding the Mean, Median and Mode will also help us understanding the manipulations of complex arrays. Before going to that section, let us understand some simple features of arrays. To do that we will use Java language, as it is one of the easiest languages to learn.
Let us understand some Array features
In Java, array is an object. However, we also need the help of primitive data types to declare and initialize an array object.
As we have seen earlier, whenever we create an array object with the help of ‘new’ keyword, the memory is allocated.
Inside the comments in the following code snippet, we will see the relation between the numerical index and the element of an array.
In this section we will give the output along with the code snippet to understand how the program works.
In Java, while we declare an array we usually take help from the primitive data types. That sounds logical as we need to tell the compiler what type of array we are going to create. According to our declaration, the memory will be allocated. One single exception is the non-primitive data type ‘String’. The following example will show you how we can handle that.
Read the comments carefully, we will learn many interesting things about an array. The proper initialization of an array varies from one programming language to another.
Java allows two types of initialization. However, one type is strongly discouraged by the official documentation.
Watch the next example, it will show you the proper way of initialization in Java.
We have compiled and run the code successfully with the help of improper initialization. However, the proper initialization is shown below:
As long as we use the Java, we will keep using the proper initialization of array.
In C language, we have seen the shortcut syntax before, the next Java program has used the shortcut syntax to create and initialize the array.
An array can contain another array, or sometimes, in some special cases, multiple arrays. We call them multidimensional array. The components of a multidimensional array are themselves arrays. The following example will show you how we can use multidimensional array.
When we work with multidimensional arrays, the built-in length property helps us to determine the size of an insider array component. In that case, the numerical index points to the built-in array like the following code snippet.
The arrays with numerical indexes 0 and 1 have length 3, while the array having numerical index 3, is of length 2. In case of multidimensional arrays, we can easily point them out and do any kind of operations.
Java allows us to copy all or a part of components of any array to another array. In the following example we can clearly see that the first ‘String’ type array has components that do not mean anything. We can rearrange the components of that array to another array, so that it becomes a meaningful sentence while we give an output.
While we give an output, we can do the same operation in a completely different way. In the above code we have used the numerical indices to get the individual elements. The following example has used ‘for’ loop to iterate through the same numerical indices and extract the string output.
One of the most common advantages of any array is we can iterate through that array and the process of iteration helps us to get all the components out of any type of array.
So far we have learned one key concept of array. The elements of any array should belong to the same data type. On that principle, the multidimensional array also works.
Inside an array we have components that also contain one or more arrays. However, the components of multidimensional array should be of the same data type for one reason. We need to declare the data type well before the initialization.
Whatever be the nature of any array, it always works on the same principle: we get the element through the numerical index.
Another important factor we should always keep in our mind. According to the depth of the multidimensional array,we can always use the nested ‘for’ loop. Consider the following example, where we have a two dimensional arrays.
According to its dimension, we have used one nested ‘for’ loop.
In Java, we have many built-in array methods. We can import those class methods and use them to manipulate any type of arrays.
In Java, an array object is created by using a ‘new’ keyword; it is needless to say that each array object creates a reference value along with it. We can easily get that reference value just by printing it out. The next code snippet shows us the same example.
We can check whether an array has a certain value or not. We can create our own method, or we can use the built-in methods like the following code snippet.
When we pass an array as a parameter of a method, it adds a lot of flexibility in our code. It also reduces the excess code baggage. Manipulations of array by passing it as a parameter lets us do many types of operations.
Like any primitive data type, we can return any array object using a method. As always, we need to use the ‘new’ keyword.
We have given the output by accessing each numerical index.
Not only internal libraries, there are extremely useful external libraries too, Apache commons is such external libraries that help us to manipulate any array.
With the help of these external libraries we can reverse any array components; in usual case, we need to write extra code to get the same result.
The Apache commons array utility libraries have many other features, which help us to get other benefits; we can remove any array element using the external libraries.
In Java, the data type of any array may be user defined. Not only primitive or non-primitive data types, but we can also use the user defined data type like the following code snippet.
We can create and initialize more user defined array objects to make our program more robust.
Consider the next problem.
Finally, we are going to conclude this section with a special Java feature. We can take out the elements directly from the array like the following code.
In this section, we have learned many simple features of array using Java language. We can use the same algorithm for any other language as long as we use the iteration or numerical indices.
In the coming sections we will see how we can relate algebraic data set properties like the Mean, Median and Mode with arrays. We will also see how we can connect discrete mathematical conceptions like Set theory, Probability and programming conceptions arrays in our mind.
Let us start with Set theory and Probability.
Set Theory, Probability and Array
In discrete mathematics studying ‘Sets’ is mandatory. Have you ever thought why? It is because Set theory is only concerned about distinct numbers. Quite naturally it has widespread applications in Computer Science.
Literally we can translate every single property of Set theory into computer programming. And, yes, with the help of simple arrays. We don’t need complex data structures, algorithm, classes, or any collection hierarchy.
Set theory conceptions are distributed over a considerable amount in computer science as a whole. We will see that later, in detail, how Set theoretical conceptions are applied in Declarative programming language like SQL.
With all types of set operations we can make declarative statements in any SQL query. You will find typical ‘union’,‘intersection’, ‘difference’, ‘complementary’, and many more.
Moreover, we can do the same thing in our array world, also.
A Set in discrete mathematics is a list of well defined collection of unique integers. Do you find any similarity with an Array? An array is also a well defined collection of similar quantities. It could be integers, any kind of decimal values, or even Strings, characters. The main difference is an array does not always contain unique items.
As a result we can manipulate arrays in some diverse ways, and that was intended when the conceptions of arrays had been incorporated in programming languages. An array allows duplicate values, and that is important when Probability comes into pictures. We will see to that in a minute.
We can say an Array is basically a Set with indexes. Both are data structures and both contain a list of items. In particular, both have similar types of operations that can be performed, such as Union, Intersection, etc.
Before going to connect them we need to have a clear conceptions about the differences. A Set does not allow duplicate items, but an Array does. A Set does not have any index attached with its items, but an Array does. In an Array we can take out a specific item and traverse the whole Array structure until it is found. In a Set, we cannot do that.
The major advantage of an Array is we can take out any value with the help of the index.
As a matter of fact, we can conclude that they have more similarities than differences. Therefore let us immerse briefly into some code snippets that will show how we can connect them in reality.
According to the Set theory, the Union occurs between two Sets and the output omits the common value. Let us do the same with a PHP code.
In the above code, we have done most common Set operations,such as Union, Differences, Intersection, and Complement.
We can use the language of set theory to define nearly all mathematical objects, as well as every kind of possible programming operations. The major reason behind that is the Set theory deals with a diverse collection of topics, ranging from the structure of the real number line to the study of the consistency of large cardinals. We will discuss those topics, in great detail, later.
After a brief display of Set theory in programming, we will switch over to Probability, another important concept of discrete mathematical operations. Just like the Set theoretical conceptions, the Probability concepts are also applied diversely into computer science and programming world.
Before dipping our toes into code, let us discuss what exactly probability means.
Consider heads and tails. If you throw it up, there is 50-50 probability to either get heads or tails.
But the equation changes when you throw a ‘dice’ or ‘die’ up in the air.
What is the probability of getting a ‘2’ when you throw a dice? The favorable outcome is 1, as the dice has one ‘2’. On the contrary, the possible outcome is 6. Because there are 6 sides in a dice. Although a dice is a three dimensional object, we can write the 6 elements in a data set. To speak more frankly, we can create and initialize an array with the 6 integers and calculate all the probabilities.
The Probability theory is not limited to dice only, we can explore a range of huge data and calculate all types of probabilities.
Let us check the next code snippet.
In the above code, you can guess what will be the outcome. Since every element of the array is the same, the probability is 100 percent.
The Probability goes down extensively with the reduction in the numbers. When the total number of a particular integer reduces, and the number of other integers increases, the Probability plunges. Watch the next code snippet, where we have used the same code as above, but we have changed the elements of the array.
Look at the output, the Probability dips drastically compared to the before code snippets.
While calculating the Probability, we have used the Python default class methods that makes our life simpler to calculate the Probability. In other languages, we are not that lucky. Take Java for instance, we need to build some function using control constructs to calculate the same Probability.
Yet, that is interesting side of programming; we can explore many possibilities. We can find solutions to given set of a problem in many different ways.
In the above code, you can guess the outcome.
Let us change the above code a little bit by rearranging the array elements, we will get a different outcome.
The Probability plunges drastically.
The same code semantically changes when we write it in PHP 7. We can use some default class methods and try some other tricks, as well.
We have tried to calculate the Probability of outcomes using the example of a dice, theoretically we have discussed it earlier.
You can watch the outcome.
Now, just for fun, we can play around the same algorithm using a more object-oriented-programming approach. We have tried to rewrite the above code in a different way to get the same result.
The possibilities are endless and the Probability is the same.
We have seen some connections between discrete mathematical conceptions and programming algorithm. We will see more. In the next section we will explore the relation between the algebraic data set conceptions, such as the Mean, Median and Mode and complex array algorithm.
Skewed Mean, Maximized Median
The idea that National per Capita Income of a country can be manipulated, is not not new to us now. Because it is calculated on the basis of the Mean of a Set of income-inputs, one can skew it quite easily. Keeping that fact in mind, we always think that the Median is more trustworthy.
In one sense, it is true. In another, it is false.
We may think of a natural algorithm where we take income of 10 people and make a set out of it. Of those ten people,eight persons have income less than 5 dollar. But, the other two earn more than 150 dollar. The definition of a Mean of a Set says us that calculating the average of those ten inputs give us an idea of the National per Capita Income.
As a result, the average income of the country becomes nearly 45 dollar; and, we know that the truth has been crucified. Where 80 percent of people earn less than 5 dollar, it cannot be true that the average income of the citizens of that country could be nearly 45 dollar.
Theoretically, choosing the Median is more trustworthy, because the middle value comes around 4 dollar, which is more close to truth value.
Our question is how trustworthy is the Median? Can it not be skewed or manipulated at all?
In this section we will turn over the myths to find out the real truth.
Usually, in a Set of positive integers where the numbers are increasing in an ascending order, it comes out that the Mean and the Median stays close.
Consider a Set of unique and distinct positive integers, like the following one.
In the above Set of values, the Mean and the Median is almost same. In a simple Java program we can check that.
Running the code will give us this output:
In the above code, the given array was like the following:
First, we have sorted that array in an ascending order; second, we have counted the array or the set of integers as odd. For that reason, the Mean and the Median has become a whole number, not a fraction.
If the number of the values, which a set contains, is even,the outcome would be in fraction.
We can check that in another Java program.
The outcome is expected, as we have been told.
Now, we can conclude one truth value from the above observation. If the values belonging to a Set is balanced and in an ascending order, there is no difference between the Mean and the Median.
Unfortunately, the reality bites and it does not come out like this.
Consider a situation where the majority of the values belonging to a Set is increasing in an ordered fashion up to a limit. After that, as it closes to the end, it suddenly behaves in an unordered fashion; the last two numbers are fairly bigger than the rest of the numbers.
What will happen?
There will be a huge difference between the Mean and the Median.
In the beginning of this section, we were discussing about the nation’s per capita income. We were told that we could not depend on the Mean. Right? We were also told that the Median is more trustworthy, in such cases.
True.
The next Java program will show you the same example that we were told in the beginning of this section.
Watch the outcome, you will be amazed to find how different they are – the Mean and the Median. From this experience, we will tend to believe that we can have our confidence or faith in the Median.
In a country where 80 percent of people have less than or equal to 5 dollars income, calculating the nation’s per capita income using the Median is more trustworthy.
At least the above program tells us so, isn’t it?
To believe this as ‘truth’ or a ‘proof of a concept’, we need to cut into the Median. This guy ‘Median’ is not an easy guy. Apparently this fellow seems to be normal and we can think, OK, the guy Median is trustworthy.
In the series of finding the true nature of the Median, we need to start with a simple program.
As an input array or set of values we have taken an unordered numbers. We know that the value of the Median varies according to the length of the array. If the length is even, we get a Median value. If it is odd, then the Median value changes.
Therefore, we have sorted our array and check it whether that array length is even or odd.
After the sorting has been done, the larger values go the right half section of the array. In the above array, the largest value was 4, so it goes to the far right corner of the array.
Now, in the runtime, we have increased the length of the array by 3 and we call the function to find the Median.
What happens? Watch the outcome.
The Median has become larger than the usual one. It has not increased in a large way, but, we have been able to skew the Median value.
Why?
If you run the code without increasing the length of the array, and calling the function to find the Median, the Median will come out as 3. On the contrary, the Median value has become 3.5.
We are nearing to a bitter truth, the Median is not trustworthy anymore. We can skew it as we have done the same thing to the Mean before.
One thing is certain, we cannot add any number to the length of the array. It depends on the original array length. If the array length is 5, then the number we add, should be less than 5. That is, up to 4 we can add. If it crosses the limit, the Median value goes out of range.
Watch the next code:
Running the code gives us error like the following.
While we skew the Median value we should keep that simple mathematics in our mind.
In the next code, it is more obvious that maximizing the value of the Median is fairly simple. In usual scenario, in the following code, the Median should come out as 3; instead, we have skewed it and made it 4.
To maximize the Median value we have introduced a function in the above code, where the function fellow takes three parameters; one of them is the variable ‘addElement’. We can declare how many integers we want to add with the length of the array to maximize the Median.
Here is the output:
We have mentioned that we can add up to 4 elements to maximize the Median, else, it will give us the same error we have faced before.
We have added 5 elements and we have got the error, same as before.
The only way to solve this problem is to increase the array elements. When the array length gets larger, we can add more elements to that length and skew the Median size.
The next code tells us the same story.
We have increased the array length adding more elements and we can maximize the Median value more than before.
The real fun begins if in an ordered collection of integers, we add only one value that is much bigger than the rest of the elements. Suddenly, the Median becomes much bigger than we have ever imagined.
Remember, with only one bigger value we can skew the Median, and make it much larger.
In normal circumstance, in the above code, the Median should have been 4. You can find that Median value without any element to its length.
However, in reality, we have just added 6 elements to artificially increase the length and is able to make the Median 41.
We can conclude one bitter truth. The Median is not as trustworthy as had believed before.
Isn’t it?
In the next section we will cut into more complex algorithm involving array.
Complex Array Algorithm
Before starting this section, let us know that we are going to use a new programming language, called Dart. It is new compared to other programming languages, such as C, C++, PHP, Java, C#, and Python; we have used them before in this book. However, we are going to use Dart, for the first time.
Readers, who have not used Dart before, can stay calm. Dart is a language with which we can build mobile apps, as well as web applications; we can also do server side programming, etc. Dart has been created by Google, therefore, we can conclude that future of Dart is not bleak. Moreover, it has many similarities with Java, and in some cases with Python, so Java or Python programmers will adopt Dart very quickly.
We are going to use Dart to show another thing. What we can do with C, C++, PHP, Java, or Python, we can do with Dart. As a result, we will be introduced to a new general purpose language; that is a benefit.
We will also see one more thing. As any language gets updated and passes into a more better condition gradually, it starts incorporating more features. They are useful, as long as algorithm is concerned. The higher the language is,it comes up with more in-built features that shorten our algorithm, sequence of steps, make developer’s life easy.
We can code more in short time.
To reverse an array, we need to write around thirty lines of code, in usual cases. Dart can do that in one line; not only that, Dart can take that array and change it to any other data structure objects, like Set or Map.
Summing up, this book is not for learning Dart, so let us forget this part temporarily, and try to understand how we can understand various complex array algorithm with the help of Dart language. In between we will also use PHP for one example; just for a change.
It is true, array has many limitations; but, it has many advantages, too. That memory can be allocated dynamically in an array, is one of the biggest advantages. This feature of array saves the memory of the system. When memory allocation is not dynamic, the array stores the data in contiguous memory locations. What data type you are using? That determines the amount of storage required.
Granted, manipulations of an array may become complex, if you think from the perspective of algorithm; but, an array requires memory space only for the values, the start address and its length. Compare it to Linked list; a Linked List always needs a pointer for every value that is stuck in. It eats up memory for every address, and acquires extra memory for the insertion of data. The Hash table also needs extra allocation of memory.
It is true that many types of data structures need more memory than array, even so, in some cases, we need data structures. We will see those features in the next section.
The first program in Dart will help us to find the largest element in an array, after that it finds the second largest element, and, after that, it finds the third largest element. The algorithm does not stop there. It arranges those first, second and the third largest elements in descending order and gives the output.
To run the above program, we have to import the Dart math libraries.
Rotating an array clockwise, is one of the most common and complex algorithm involving an array. Actually, it rotates the array to the left by the number of elements. Suppose you have an array like this:
Now, we will rotate the above array to the left or clockwise by one element. Then it becomes like the following array:
If we rotate it by 2 elements, it becomes:
Rotating 3 elements will give us this:
From the above algorithm, we get a common pattern that we see every day. Yes, we are thinking about a clock. A clock has 12 elements. We will get to that point in a moment. Before that, we need to understand this rotational algorithm in a more efficient way. The next code snippet will give you a better idea about it.
Let us first see the output first, then we will try to understand the algorithm.
We have to create a function that will take two inputs – the array and the length of the array. It will give us output of a temporary value by shifting one element. After that we can call that function inside another function that takes three inputs – the array, the length of array, and the number of elements you want to shift to the left.
You can dynamically create any length of array and test the above code in any programming language. It will give us the same result.
Based on the same algorithm, we can now create a digital clock in PHP.
While we run the code, first it gives us the hours arranged just like any analog clock. Although the analog clock is an example of discrete mathematical conceptions, it represents contiguous mathematical conceptions that run continuously.
On the other hand the digital clock is the correct example of discrete mathematics.
Here it works as a digital clock. As you enter the amount of hours, it gives us the output accordingly.
Run the code locally, you will find the starting element in bold point.
According to the Set theory, we know how ‘Union’ of two sets work. If there is no repeated values, then they join accordingly. We can imagine a situation where, we can add two Sets A and B like this:
It will produce a third Set C like this:
We can consider them as arrays also. Two arrays, A and B. Now, mathematically we can also do some Union operations on those Sets in reverse order. It means, just as we have added A and B, we can also add B and A. And that Union yields this result:
Our next algorithm will be to find how we can change this reversing process in a different way. We are going to change (A Union B) to (B Union A) using Dart programming language.
Certainly, there are different solutions available. We follow the next algorithm.
Let us read the comments inside our code. We have clarified our algorithm there.
As we have seen before, there are several solutions available. We just need only 12 lines of code to rotate any array clockwise by one element. The next code snippet will show you how we can do that.
The outcome is quite expected. Rotate the above array clockwise just by one element, and it plucks off the last element from the end and places it at the starting point.
In this section, we have seen many array algorithm, and hopefully those make enough sense to go ahead for more. For us, in the next section, the data structures are awaiting.
Before going to data structures, we should remember one key point – they are not sequential as an array. It has advantages and disadvantages. While we cut into data structures, we will try to use the same algorithm to understand which one is more preferable contextually. No other data structure is able to save memory like array. That it quite evident. However, there are other advantages;and, we will learn them in the coming sections.
Question 1: The discrete mathematical algebraic conceptions that are known as date set is the root of Data Structures in Computer Programming.
Option 1: False
Option 2: True
Answer: Option 2
========================
Question 2: The main advantage of array is, it helps us to save the memory of the system.
Option 1: False
Option 2: True
Answer: Option 2
=======================
Question 3: Which collection of elements or data structure holds a similar “type” of elements?
Option 1: Map and Array
Option 2: Array, Set and Map
Option 3: Array
Option 4: None of the above
Answer: Option 3
=======================
Question 4: Why it is mandatory to learn Set theory in Discrete Mathematics?
Option 1: Because the Set theory conceptions are distributed over a considerable amount in computer science as a whole
Option 2: Set theory is only concerned about distinct numbers.
Option 3: With all types of set operations we can make declarative statements.
Option 4: None of the above is true.
Answer: Option 2
=======================
Question 5: What is the similarity between the Set theoretical conceptions, and the Probability concepts.
Option 1: Both are applied diversely into computer science and programming world.
Option 2: There is no similarity between these two Mathematical Concepts.
Option 3: We can explore a range of huge data or Set, and calculate all types of probabilities. Hence they are similar.
Option 4: Both deal with distinct numbers.
Answer: Option 1
=======================
Challenge 1 : Can you write a program that will find whether a particular number is there in a randomly generated data structure of integers or not.
Solution for Challenge 1:
Language used: Java
Explanation and clue to solution: Because Java ‘Math.random()’ method produces ‘double’ data type, we have to cast it to a round figure by using ‘int’ data type.
Challenge 2 : Can you reverse an ordered sequence of words? Example: sanjib to bijnas
Solution for Challenge 2:
In first case Language used: python 3.10
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
printing result
==========
In Second case Language used: Java
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
’’’’’’’’’
In the second case, with the help of these external libraries we can reverse any array components in less lines of code. In usual case, we need to write extra code to get the same result.
However, in Dart Programming langauge, it takes only one line of code.
’’’’’’’’’
In Third case Language used: Dart
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Challenge 3 : According to the Set theory, the Union occurs between two Sets and the output omits the common value. Can you write a program to prove this concept.
Solution for Challenge 3:
Language used: PHP 8
Challenge 4 : Write a program where every element of the array is the same. Find out the probability.
Solution for Challenge 4:
Language used: Python 3.10
5. Data Structures: Abstractions and Implementation
As we have said in the beginning of the book, Data Structures are the one of the most fundamental and essential building blocks of Computer Science. Any Data Structure can be divided in two distinct parts – Abstract Data Types (ADT) and Implementation.
As we progress, we will have elaborate discussion on these features; so, you need not worry at the beginning of this chapter.
We have seen many examples of Array, which is the first step to understand Data Structure. Therefore, we already know that Data Structures are basically different types of manners through which we sort, organize, insert, update, remove or display our data.
Sorting and organizing data in an efficient manner plays a very big role in constructing our data structure. Besides, we need to remember one key aspect of computation; that is,how much time the program takes, and how much memory or space it acquires.
In this part Algorithm plays a vital role. Efficient Algorithm to handle Data Structure is always needed. In this chapter, we also look at that part – time complexity, and memory allocation.
Data Structure, as a whole, is a very big topic and, moreover, every programming language has their own way to use the basic concepts of Data Structure. We will limit our main discussion to C, C++ and Java; although, in many occasions we will talk about Python, Dart, PHP and C#, as well.
To give you an idea about how the whole concepts of the complex entity like Data Structures are constructed, we should look into the Collection Interface first. In Java, the Collection Interface is divided into many branches of sub-interfaces, they are – BeanContext, BeanContextServices, BlockingDeque, BlockingQueue, Deque, List, NavigableSet, Queue, Set, SortedSet, TransferQueue.
Let us first talk about the List Interface. AbstractList class implements the List Interface. Next, three more classes – ArrayList, Vector, and AbstractSequentialList classes extends the AbstractList class. Finally the LinkedList class extends the properties and methods of AbstractSequentialList class. However, this list is incomplete.
Actually, there are many implementing classes; they are - AbstractCollection, AbstractList, AbstractQueue, AbstractSequentialList, AbstractSet, ArrayBlockingQueue, ArrayDeque, ArrayList, AttributeList, BeanContextServicesSupport, BeanContextSupport, ConcurrentLinkedDeque, ConcurrentLinkedQueue, ConcurrentSkipListSet, CopyOnWriteArrayList, CopyOnWriteArraySet, DelayQueue, EnumSet, HashSet, JobStateReasons, LinkedBlockingDeque, LinkedBlockingQueue, LinkedHashSet, LinkedList, LinkedTransferQueue, PriorityBlockingQueue, PriorityQueue, RoleList, RoleUnresolvedList, Stack, SynchronousQueue, TreeSet, and Vector.
And this ‘Collection’ interface has a super-interface – Iterable.
As you see, the list is quite long; understandably, we are not going to cover all these things in a chapter. Maybe, in a different book we can take a more detail look at those interfaces and implementing classes.
In this chapter we will look into the core Collection interface and its implementations that include Set, SortedSet, List, Queue, Deque, Map and SortedMap. Core collection interfaces are the foundation of the Java Collections Framework.
In C or C++, it is thought differently. Other languages have their own constructs.
Before discussing Data Structures, we must have some basic knowledge about how objects act upon one another. How different objects pass messages between themselves.
We will also try to manually insert data into a Linked List. This is a List that is linked to each other.
How objects work with each other
We are going to use four, easy-to-understand sample Dart program to see how one object works with other objects.
Let us talk about the first program. Suppose each person acquires a mobile application that helps them to enter their tasks. They can categorize the tasks and according to the necessity, they finish their tasks.
To accomplish such simple task, we need to have two classes – Person and the Mobile Application class. Because,one person object acquires one application object, we need an application object inside the Person class.
The application object has a blueprint or class that defines what it can do and what it cannot do.
Once a person acquires an application object, she can start doing everything with that object that has been defined in the Mobile Application class.
I hope the explanation makes sense. Each person object is a separate entity, and each application object is also separate entity. However, every separate object has some commonness, and that commonness has been defined in the class.
We can expect the output, now.
The same principle can be adopted for two separate persons. The next code snippet manifests that principle.
Here goes the output:
We can make the above code more robust and complex. However, we are trying to understand how objects work, just like other primitive data types. There is no doubt that an object is always more powerful than any single primitive data type. If you have some idea about object-oriented-programming, you know that an object encapsulates many dynamic features. An object’s power depends solely on the principle that how we have defined that object in its class.
Now, we are more curious about how we can implement this newly acquired knowledge to manipulate a simple Data Structure.
Suppose, we want to insert data into a list, and show them also at the same time. Without taking any help from array, can we do that? Can we write classes that will define such objects that will work with each other to manipulate data in a data structure?
Let us see the next code.
We have created a Node Class and with the help of node object we have successfully inserted data into that structure.
In the above code, we have inserted data manually. Can we create two separate objects that will manage and show the data?
Yes, we can try to do that in the next program.
Here goes the output, where the node object ‘list’ has successfully inserted data, and finally, gives us a display of the inserted data.
We have tried to understand how in object-oriented-programming languages like C++, Java, or Dart, these Collection interface is being implemented by the implementing classes. Now, we will understand how the core libraries and functions of the Data Structure classes and objects work together.
More Algorithm and Time Complexity
This section will be short. We won’t take much time to study a few code snippets, after that we will try to understand how much time it takes to run the program.
At the same time, we will also try to understand the underlying logic, and algorithm.
By this time we are quite familiar with the terms like algorithm and time complexity. We know that our algorithm should be efficient enough to reduce the time to run the code.
We should also remember that, any problem can have many solutions; we need to find the best algorithm.
Okay, let us try to some code in Dart. In the first case, we will try to find the factors of a positive integer. As we know, a positive integer, like 4, has three factors – 1,2 and 4. Factors are the integers that can divide the number. It always starts with 1 and ends with that number.
A prime number is a number that can be divided only by the 1 and the number itself, such as 2, 3, 5, or 7. The list goes on. To find whether a number is prime or not is very easy.
We can start the examination with a prime number like 5. All we need to check that whether all the numbers starting from 2 to 4, can divide 5 or not. If any number can divide 5, then 5 is not prime. Else, 5 is prime. Now, we can easily check whether a number is prime or not.
We will come to that point in a minute. Before that, we will find factors of a number.
Let us see the output first.
In the first part of code, we will loop through the number itself.
However, in the next code, we loop to the half of the number.
But, the problem, in both cases, we need to the number itself. So the time complexity is big O(n).
Can we reduce the time, to run the program?
Okay, we can check them with the next sample.
This time, we have succeeded to reduce the time complexity to big O(square root of n). For a big amount of integer, this algorithm fits fine.
In the next code snippets, we will find the prime factors of a positive integer. Factors of any integer can be divided to get the prime factors easily. The factors of 4 is 1, 2 and 4. We cannot divide 1 and 2. The integer 2 is prime. But, we can divide 4 and get 2 multiplied by 2.
Finally, it looks like: 1, 222. We can say the prime factors of 4 is 1 and 222, or we can also say that the prime factors are – 1 and 2 to the power of 3.
Let us check the next code snippets to find out the prime factors of any positive integer by passing the integer as parameter to a function.
As you see, in the first function, due to our algorithm, the time complexity is more. However, we have succeeded to reduce that in the second function.
In the next program, we will try to find whether a positive integer is prime or not.
We have passed two positive integers to test whether they are prime or not.
In the next section we will start discussing data structures. However, before moving into the data structures, we will see a couple of geometry algorithm, which is important to understand one key feature of data structures in C and C++.
At the same time, we will cut into discrete mathematical concepts of vector space. In Computer Science, Vectors represent a couple of things, such as lists of rows and columns, only rows; sometimes, we can place a vector point with two co-ordinates X and Y and find the direction of the point with respect to a line segment.
This algorithm is used in producing maps, directions, finding area of polygons, and many more things using Vector cross products.
In a map, how do we get the direction of a point? Should we turn right or left? We can move forward also, if that point lies on the same line.
In Geometry, three concepts play vital roles – point, line and plane.
You have probably noticed that, we are talking about these three features, in particular. To understand our points, let us see a figure (Figure 5.1) first.
Taking a look at the figure, we can say that the point C is on the left side of the line segment AB. But, is there any formula, to find that in Vector space?
Yes, there are; we can use the Vector cross products. We always calculate the position of any point placing them on the Cartesian plane. We have X and Y co-ordinates. Having the values of two co-ordinates helps us to find the direction of C from the line segment AB.
Suppose we are moving from A to B. Now we want to take turn and reach C. Should we take the left turn, or right turn? Well, the above figure tells us to take the left turn. But, how will we calculate that using co-ordinate system?
In any two-dimensional problem, like this, using Cartesian plane is the best choice, and we have adopted that for the sake of our algorithm.
According to the above figure, co-ordinate values of A is (-4, 3), and the co-ordinate values of B and C are respectively (2, -3) and (5,3).
For the sake of simplicity, if we consider that A lies on the origin O, that is, the co-ordinate values of A changes to (0, 0), then the cross products of B and C help us to determine the exact position of C with respect to B.
If the value of the cross products is greater than 0, or positive, or counter clockwise, then C is on the left side of B. If it is negative, then it is on the right hand side. Finally, if it is 0, then C is on the same line.
The Vector cross products formula is pretty simple. It is just as follows:
According to the figure (assuming A has co-ordinate values (0,0) ), the value is 15, which is positive. It proves that C is on the left side of point B. However, we have assumed that the co-ordinate values of A is (0, 0), to look it simple; which is actually not.
To do the real calculation, we need to shift A from its original position to the origin O. The co-ordinate values of A will change from (-4, 3) to (0, 0).
That will also change the co-ordinate values of B and C.
The new X co-ordinate value of B will be (X co-ordinate value of B – X co-ordinate value of A), and the new Y co-ordinate value of B will be (Y co-ordinate value of B – Y co-ordinate value of A). The same rule will be applied to the co-ordinate values of C.
In our next Java program, we have calculated that find the direction of C from line segment AB.
Take a look at the output:
We have already learned how the Vector cross products work to find the direction of the point. So, we are not discussing the algorithm, anymore. We can write our code, in a different way to accomplish a different type of task, considering that we need not shift A.
In the next code snippet, the point A is the origin, having the co-ordinate values (0, 0).
For the sake of simplicity, we have not only kept the co-ordinate values of A as (0, 0); at the same time we make the value of C same as B.
Here is the output:
We are again going to experiment with our code. However, this time we have three different co-ordinate values of A, B and C.
If you run the program, you will get the outcome as follows:
You can change the co-ordinate values of A, B and C to see how your output changes.
In the next section, we will start with the same problem to understand how data structures in C and C++ works. After that, we will start digging deep into Data Structures and Algorithm in chapter 6.
In the next section, we will have a brief introduction to the Data Structures using the same Vector cross products and finding direction.
Introducing Data Structures
We have already been introduced to data structures before. Of course, we have learned a few operations using Array in various languages, so we can say that the concept of data structures is not completely alien to us.
We need a good way to store, organize and use our data. As times passes by, the nature of data is becoming not only more and more complex, but also it’s getting bigger in quantity. More and more people are getting hooked to the Internet, exchanging huge amount of data every day, in various forms; scientific data are getting larger, we need weather data to be processed to get more accurate weather prediction, medical data are becoming humongous; this list is endless.
Therefore, we need more efficient way to sort, organize, and use that data.
Data Structures are all about this. It has a very close relation with Algorithm, because managing such huge amount of data is less tedious if we have more efficient Algorithm, ready at our hand.
While managing such huge humongous data, by sorting, organizing, or using them, one is not only prone to error, but also fail to satisfy one of the most important requirements – time and space. Yes, time complexity really matters, so the space.
In this section we have a short introduction to the Data Structures, but we will actually start the topic in the next chapter. First of all let us have a look at a figure (Figure 5.2) first.
We have clearly stated what Data Structures are, in the above figure. We can describe our data structures in our natural language; that is a part of any data structures. It is always good to write it down what we are going to do with our data structures.
Consider a static List or a fixed length List; that is, an array. It is a collection of similar type of data, either integer, double, or string. Moreover, it should have a fixed length while we declare it. Within that range of length, we can insert data, modify data at a specific position, even we can remove a data and make the memory space empty.
Whenever we declare a fixed length List, the memory manager fixes an amount of memory for that list or array. When we write, “int array[4]”, it means, memory manager allocates 4 bytes each for individual address, altogether 16 bytes are allocated.
In language like C or C++, if we want to insert one more data, in whatever position we can imagine; the task becomes tedious. First of all, the memory manager has already fixed a space. So we need to create a larger array. Copy the whole array to the new memory space and at the same time empty the old space.
To make an array dynamic, we might think an array with maximum size that particular data type allows us to use. Since the array is a contiguous block of memory, a large space remains unused. Suppose we declare an array of ‘n’ length. We start with 10 data. Next, as per our need we start adding data to the array. If we add one single data at the very beginning, we need to shift all other data by one space. If the data type is integer, we need not add 4 bytes more at the end in this case particularly because we have kept that space beforehand, after that we need to shift the other data accordingly.
Whatever we do, a large space still remains unused. In terms of memory allocation, this process does not seem friendly.
Since the length of an array is fixed, it works on a constant time.
Now, new languages come up to solve the limitations of older language. Consider the Dart. In Dart, we get two types of List. One is of traditional type – fixed length. Another is growable or dynamic List.
Enough talking, let us see the first code – fixed length List.
The output is quite expected.
In Dart, everything is object. Therefore, it is a collection of similar objects. Here it is integer. Next, we will make this list dynamic, and see how it works.
Growable Lists are dynamic in nature. We can dynamically add any number of elements
and we can also remove it by a simple method: ‘names.remove(“any name”)’. We can also use the key; as this ordered list starts from 0. So we can remove the first name just by passing this key value: ‘names.removeAt(0)’. We use the ‘removeAt(key)’ method for that operation. We can also clear the whole List just by typing: ‘names.clear()’.
Let us see the output, now.
We have a very short introduction to Data Structures. Hopefully, now we understand what are the limitations and advantages of an array. We will discuss them in detail in the first section of the next chapter 6. Comparing array with other Data Structures will help us gain more insights into this complex topic, which is the most fundamental block of Computer Science.
How Calculus and Linear Algebra are Related to this Discourse
Calculus is the branch of mathematics that describes changes in functions. Now, linear algebraic operations start with functions; moreover, computer programming cannot move a step forward without the implementation of functions.
Mathematically, we can define polynomial, rational, trigonometric, exponential, and logarithmic functions, and at the same time, we can review how to evaluate these functions, and we show the properties of their graphs.
Is the concept of function used in computer programming same as mathematical functions?
Before finding the answer, let us formally define what is mathematical function. In Mathematics, before defining a function, we need two sets A and B, where where x is an element of A and y is an element of B, is a relation from A to B.
Now, a relation from the set A to the set B defines a relationship between those two sets.
Mathematically, a function is a special type of relation in which each element of the first set is related to exactly one element of the second set.
We call the element of the first set as the input; and, the element of the second set is called the output. Functions are used all the time in mathematics to describe relationships between two sets.
Actually, when we know the input, the output is determined in a function.
We can write a function like this:
Now, the f(x) can be of any value, such as ‘x + 1’, or ‘x - 1’; in fact, with the addition, subtraction, multiplication or division of any constant value, as we change the value of x, the value of y will change.
The same thing happens in computer programming; a function returns a value. Not always; but it is a general rule that is followed by any programming language.
We may argue that not every function returns a value; there is something called ‘void’, but that also gives us what? An output.
We can find the area of a square if we know the value of one side. We can find the area of a circle, if we know the value of radius.
Mathematically, when we write y = f(x); it can also be said as ‘y equals f of x’. While writing a function like this we refer to x as the independent variable, and y as the dependent variable.
Quite understandably the value of y depends on the value of x.
Now, a function consists of three things, in particular. A set of inputs, a set of outputs and a rule for assigning each input to exactly one output. Mathematically, the set of inputs is called the domain of the function and the set of outputs is called the range of the function.
If the assigning rule of a function is f(x) = 3 – x; and the domain is {1, 2, 3}, then the value of y or the range will be {0, 1, 2}.
Not only we can visualize the function by plotting points (x, y) on coordinate planes, we can write something like this in any programming language.
The code is as follows:
Let us check the output:
We can see in the above code that the range remains at 4; however, the domain consists more than one values that point to the one output, which is 4. And yes, the domain could have infinite values depending on the permutation and combination of the value of the coefficient.
In some cases, we also need to know how calculus works in computer programming. Consider a car that should have the history of both – velocity and distance. If one is missing,we can calculate the value of the other by using calculus.
Therefore, we need to know how to find the velocity from a record of the distance. This part of calculus belongs to differentiation or differential calculus. On the contrary, we also want to compute the distance from a history of the velocity. That is called integration, and it is the goal of integral calculus. We can conclude, differentiation goes from distance to velocity; integration goes from velocity to distance.
However, there is another important factor that we should consider – time. You cannot calculate velocity without time, the rate of speed, etc. At the same time, we also want to know how much time it takes to travel a certain distance. Now, we can guess that from velocity and distance, we can also calculate time.
Let us consider the following code snippet, where we have calculated these factors.
Here is the output:
Now we are ready to discuss data structures in detail in the next chapter.
In the first part of code, we will loop through the number itself.
‘’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
However, in the next code, we loop to the half of the number. So the time complexity is big O(n).
‘’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
However, in the next code snippet, we have succeeded to reduce the time complexity to big O(square root of n).
‘’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
Therefore, for a very big integer, the above program will take less system resource.
‘’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
Challenge 2 : Write a program where from velocity and distance, we can also calculate time. Write down how different fields of Calculus works together.
We have learned how to find the velocity from a record of the distance. This part of calculus belongs to differentiation or differential calculus. On the contrary, we also want to compute the distance from a history of the velocity. That is called integration, and it is the goal of integral calculus. We can conclude, differentiation goes from distance to velocity; integration goes from velocity to distance.
Challenge 3 : Handling Data Structures in high-level language beomes easier as mew language comes up. Can you prove this theory with an example?
Whenever we declare a fixed length List, the memory manager fixes an amount of memory for that list or array. When we write, “int array[4]”, it means, memory manager allocates 4 bytes each for individual address, altogether 16 bytes are allocated.
In language like C or C++, if we want to insert one more data, in whatever position we can imagine; the task becomes tedious. First of all, the memory manager has already fixed a space. So we need to create a larger array. Copy the whole array to the new memory space and at the same time empty the old space.
However, in Dart, Growable Lists are dynamic in nature. We can dynamically add any number of elements and we can also remove it by a simple method: ‘names.remove(“any name”)’. We can also use the key; as this ordered list starts from 0. So we can remove the first name just by passing this key value: ‘names.removeAt(0)’. We use the ‘removeAt(key)’ method for that operation. We can also clear the whole List just by typing: ‘names.clear()’.
6. Data Structures in Detail
In this chapter we will discuss every facet of data structures.
Frequently Asked Questions about Data Structures
We show the outline of Frequently Asked Questions about Data Structures in the following way:
While we broadly categorize the above data structures we have kept Java in our mind. It changes with the programming languages. Of course for C it is quite different. However, once you understand the basic concepts,you can transplant the Java code into C++ or vice versa.
Earlier we have seen that any data structure is a way of storing and organizing the data so that it can be used efficiently. It provides a large amount of data efficiently. Efficient algorithm can only be designed with efficient data structures.
Abstract Data Type (ADT)
The first step of designing an efficient data structure is to develop an efficient mathematical model for the data to be stored. After that, we need to choose the methods to access and modify the data. This model with the methods is called Abstract Data Type or ADT.
Through the ADT, we address all the functionalities of the data structures. It tells us what we want to do with the data structures. However, ADT does not tell us anything about the implementation process, memory management, or the algorithm we implement for the data structures.
As we have said earlier, efficient algorithm depends on efficient data structures, we will definitely be interested on which algorithm we should implement; however,in the ADT stage, it is not necessary.
As an example, if we are implementing a dictionary ADT, we may want to implement a “search(word)” method. At the very beginning of the project, we have to specify that; and, what we are doing in that stage is nothing but ADT.
Now, in case of Java, a data structure for implementing an ADT is a structured set of variables for storing data. On the implementation level, an ADT corresponds to a Java Interface, and the data structure implementing the ADT corresponds to a class implementing that interface. We will see to this part in the later section of the book, where we will discuss ‘Collection’.
Linear Data Structures
In Linear Data Structures we can arrange elements sequentially so that one element may have next element and the next element may have next elements, and we can extend the sequential order as long as we could extend it.
In this section, our mathematical model of the data is a linear sequence of elements; this sequence has well-defined ‘first’ and ‘last’ elements. Every element of a sequence except the first has a unique predecessor while every element except the last has a unique successor. As an example, consider a String ‘good’; here, the characters are ordered sequentially. The character ‘g’ is the first element and has no predecessor. The character ‘d’ is the last element and has no successor.
The next figure (Figure 6.1) shows the three layers deep linear data structure samples.
We can show this linear and sequential data structures using the following code snippets in Java.
As the topmost layer or the outermost layer has three boxes, we can arrange the output in three different ways as follows. The first output goes this way:
The sequential order is rearranged if we change the index of the outermost layer from 0 to 1. Watch this part of the above code snippet:
And the rearranged output is like this:
We can change this order three times as the outermost layer has three indices or we can imagine it as boxes. The last sequence looks like this:
Modeling of a Structure
In the Abstract Data Type part, correct modeling of a structure is extremely important. We need to understand why it is important.
In any programming language when we write a code, we actually write some instructions for the computers. Here, human communication through programming language also plays a crucial part. Here when we say humans, we mean programmers. Another programmer will also read the code, and programmers are not computers.
Just because a compiler can understand a given construct there is no guarantee that a programmer can also understand that construct. Therefore, the ADT should be clear and concise. Moreover, it should be human readable and understandable.
Because a human mind has many limitations, any code should be elaborately documented so that the purpose of using any particular algorithm is clear and graspable. Consider a stack algorithm, which handles nested constructs for a compiler. If it is very complicated, human mind cannot follow the structure. It is especially true for the data structures. Deeply nested constructs in a data structure can be incomprehensible; the limitations of human mind cannot comprehend it.
While a construct is ambiguous to a human, at the same time it is clear and comprehensible to a compiler.
Consider a simple division algorithm.
Although humans cannot follow the construct of division without parentheses, nevertheless the compiler does not complain. It gives us the correct output.
If we wrote this line this way adding the parentheses :
The limitations of human mind would not deter it from comprehending correctly.
ArrayList to overcome limitations of Array
Working with array is difficult because they have a fixed size and it is not very easy to add or remove data, always. Arrays are sequentially ordered; for that reason, adding or removing elements from any position between two elements will also involve shifting other values. The whole operation makes the processor overwork, forcing it to overwork to solve the problem.
While the ArrayList is also sequentially ordered, and processor takes the same steps as it does in case of Arrays, nevertheless it is easy to handle because of the ADT.
ArrayList is an ADT that provides a generic class, which has many useful methods to deal with a collection of data. It also supports different data types. Using the ArrayList methods we could easily add, remove, modify any element. Moreover, we could count the size of the list and based on which we could use the looping construct.
An ArrayList is declared as follows:
Here T is the data type, not the primitive data type, but the Wrapper Class. Let us take a look at some implementations.
We have added a few elements and the output is quite simple.
Next, we want to remove some elements and give the output to see how it works.
Here goes the output:
Next, we are going to modify the first element of the list; we will modify it changing ‘index’ to ‘home’.
We have successfully modified the value. Running the program gives is this output:
After storing the data, the need to sort them is extremely important. In the final ArrayList code snippet, we will sort our data structures.
In the above code, we have displayed the unsorted output first, after that, we have given the output of the sorted list.
We have seen some examples of ArrayList. In the next section, we will see how LinkedList works. We will also learn why LinkedList consumes less memory than ArrayList.
ArrayList or LinkedList, which is faster?
As we have seen that the two most natural ways of storing sequences in computer memory are arrays and linked lists. ArrayList is an ADT that uses all concepts of handling Arrays with more flexibility. That is why, the time complexity is same for the both.
In LinkedList, it does not work that way. We usually model the memory as a sequence of memory cells, each of which has a unique address. An array or ArrayList is a contiguous piece of memory. Each cell of the memory stores one object, which is a part of the sequence stored in an Array or an ArrayList.
In a singly LinkedList two successive memory cells are allocated for each object of the sequence. Two memory cells form a node of a sequence. The first cell stores the object and the second cell stores a reference to the next node of the list.
In the next node there is two memory cells, and the previous node points to the first memory cell of the next node.
In a doubly linked list, the mechanism changes. Then we not only store a reference to the successor of each element, but we need to have a reference to its predecessor. It means each node should have three successive memory cells. We will discuss it in detail later.
At present, we will see how a singly LinkedList works.
In the above code we have implemented the ADT of a singly LinkedList, where we have only added or inserted next node until it reaches the NULL value.
Now, we have an introduction to Data Structures; we will dig into this matter more in the coming sections.
Collection Framework in programming languages
Every programming language has its own Collection framework. For brevity, in this chapter, we will use two programming languages in particular; they are Java and C++. However, the core ideas are same and implemented widely in every language; we will see to it later.
First of all, we need to understand one thing first. In this book, we are going to learn data structures and algorithm, because they are inter-dependent; and, they are also related to discrete mathematical conceptions.
All together, it is a burden of huge responsibility to organize the data structures and arrange the necessary algorithm to do every kind of operations possible to make an application run successfully. Now, if we get busy in low-level plumbing to organize the data structures using necessary algorithm from the scratch, we cannot concentrate on the other important parts of our programs. Collection framework helps us to get rid of that heavy load of low-level plumbing.
Therefore, every popular high level programming language has its own Collection framework.
As far as Java is concerned, the Collection framework, as an unified structure, consists three core parts; they are interfaces, implementations and algorithm.
Interfaces, as always, represent abstract data types; in object-oriented languages, interfaces generally form a hierarchy and its implementations depend heavily on the data structures and algorithm. When an object implements Collection interface, it uses some methods to perform useful computations, such as searching and sorting. These methods are called algorithm.
We need to understand another core conceptions regarding algorithm; it is polymorphic. It means, we can use the same method on many different data structures.
In a moment we will find how it works.
Before going to see the polymorphic implementations of algorithms on different data structures, we must be aware of the benefits of Collection interfaces. In a nutshell, they reduce programming efforts, as we have discussed it earlier, you do not have to do the low-level plumbing. This framework gives you enough freedom to work on other important parts of program, because it supplies high-performance and high-quality implementations of useful data structure and algorithm.
Enough talking, let us see some code snippets to buttress our theory.
Stack and Queue in Java
Stack and Queue are abstract data types or interfaces that extend Collection interface in Java. They have more or less similarities; except the core concept where we remove the elements.
Stack uses ‘last in first out’ or LIFO method. It means, when we use the remove method, the last element. that has just been inserted, will be popped out first.
In Queue, the opposite happens. It works on ‘first in first out’, or FIFO algorithm.
In this section we will mainly concentrate on Stack and Queue interfaces and on their implementations. In the coming section, we will learn about Deque; it is a double-ended-queue, a linear collection of elements that supports the insertion and removal of elements at both end points. It is the main advantage of Deque. For that reason,the Deque interface is considered to be a richer abstract data type than both Stack and Queue; quite naturally, it implements both stacks and queues at the same time.
Let us start with some Stack algorithms first. In the following program, we will see all the necessary algorithm of stacks.
In the output, we will see how stacks work.
Although we do not have to build the Stack algorithms from the scratch, we can have a try to understand the inherent mechanism.
Watch the output; we have succeeded to create our own stack class.
Now we can add more functionalities to our Stack class. Watch the following code snippets.
We have added new elements and also set the limit of the stacks. We have also tested whether the stack has been overflowed or not.
We have found that simulating Java’s core algorithms is not difficult, we can add more functionalities to our Stack class. We verify the limit to check that no new element can be added.
Here goes the output:
The Queue interface and its implementations are different, although the algorithm we use have the similarities with the Stack interface.
Let us see one simple Queue implementation, where we have added a few elements. We have kept the code and the output at the same place.
We can check whether a Queue has any element or not using this algorithm.
We can convert an array to a queue and use all the queue methods to manipulate that array. This is a big advantage of any Collection framework; because, we can always do this type of conversions.
As always, there are same types of algorithms with the different types of algorithms.
Just like Stack, we can also set the limit of a Queue. Crossing that will give you an exception.
We can create our own Queue class by implementing the Java Queue interface; however, in this section we are not going to do that. In the coming section, we will check the Deque Abstract data type and see how it uses Stack and Queue at the same time.
Deque, a high-performance Abstract Data Type
We have seen the implementation of the Deque interface before; the Deque is pronounced as ‘deck’. In Java, one of the general purpose implementations include LinkedList. Another is ArrayDeque classes. In terms of efficient operations and flexibility, they can be compared. We will see to them in a minute.
Before that, we will quickly go through a Python code where we will create a double-ended queue or a deque. We have seen an examples of queue, the ordered collection of items. In deque, we have the ordered collection, which has two ends – a front and a rear. One characteristic has made this interface unique in nature – there is no restriction in adding and removing items. New items can be added, either at the front or at the rear.
The same way, we can either remove any item from the front or from the end. In a sense, this hybrid linear data structure provides all the capabilities of stacks and queues under one roof.
Although, it has combination of stacks and queues, it never assumes the LIFO or FIFO orderings that stacks and queues usually posses.
In the following Python code, we have created the Deque class that follows the guideline provided by the interface used in Java.
The output will tell the story in detail, how we have tested whether the Deque collection is empty or not. We have also added and removed elements at the either ends and shown the output.
In Java, we have performed the same operations, using the ArrayDeque class.
The output is almost same as the Python code we have seen before.
The next example in Java implementing ArrayDeque class has not done anything new, except that we have given the output in a different way.
It is evident in the output.
As we have said earlier, the ArrayDeque and the LinkedList classes implement Deque interface in different manner. The LinkedList class is the ‘list’ implementation of the Deque interface; whereas, the ArrayDeque class is the resizable implementation of the same interface.
The basic insertion, removal and retrieval algorithms consist of ‘addFirst, addLast, removeFirst, removeLast, getFirst and getLast’ methods. As the name suggests, the ‘addFirst’ method adds an element at the head whereas the ‘addLast’ method adds an element at the rear.
The ‘null’ elements are allowed in the LinkedList, but in ArrayDeque, it is not allowed. LinkedList implements all ‘list’ operations, which adds more flexibility to it.
However, if we compare the efficiency, ArrayDeque is more efficient than LinkedList; because using ArrayDeque, we can add and remove at both ends. Moreover, LinkedList is not a good candidate for iteration.
Again, the advantage of LinkedList is during the iteration,we can remove the current element. According to the size complexity, the LinkedList implementation consumes more memory than ArrayDeque.
As long as data structures are concerned, these implementations are data structures and they always come with their own algorithm.
Here context plays a great role and according to the contexts you should choose the data structures. Comparisons are always there, in every programming language, based on the context, one data structure is preferable than the other.
In C++, the ‘std::deque’ is considered to be a container that allows fast insertion and deletion at the both ends. The advantage in C++ is, when we add or remove any element at the beginning or the end, it does not make pointers and references invalid to the rest of elements.
Let us first see an example of deque in C++.
We have created a deque container that contains integers and after that, we have pushed on integer at the beginning and one element at the end.
Therefore we have got this output:
Now, we can look at this C++ code more closely. We can automatically expand and contract the storage place as needed. In C++ ‘std::deque’ is compared with ‘std::vector’,because expansion of deque is cheaper than the expansion of ‘std::vector’. The advantage of ‘std::deque’ is it does not copy the existing elements to a new memory location.
The following example shows how it happens.
Depending on the resizable nature of expansion and contraction, the above C++ program gives us the following output:
Because the ‘std::deque’ is an indexed sequence container, it is extremely fast in insertion and deletion process.
Moreover, the above example shows us one key aspect of deque in C++, the resizing process, or the insertion and deletion processes does not have any effect on pointers and references to the rest of the elements.
In the mext chapter, we will discuss algorithm and data structure in more detail; stable object-oriented programming languages like Java or C++ always provide reusable functionalities that are known as polymorphic algorithm.
Question 1: The ArrayList is also sequentially ordered, and processor takes the same steps as it does in case of Arrays.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 2: What is Abstract Data Type (ADT)? Is it related to Data Structure?
Option 1: The ADT is an abstraction of data structure.
Option 2: There is no relation between ADT and Data Structure.
Option 3: The model with the methods to access and modify the data is known as ADT. It is related to Data Structure.
Option 4: None of the above statement is true.
Answer: Option 3
=======================
Question 3: Every programming language has its own Collection framework.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 4: Working with array is difficult because they have a fixed size and it is not very easy to add or remove data.
Option 1: The above statement is correct.
Option 2: The above statement is true and ArrayList helps us to overcome this limitaions.
Option 3: The above statement is partly true.
Option 4: The above statement is completely false.
Answer: Option 1 and 2. Both statements are true here.
=======================
Question 5: LIFO is the exact opposite algorithm of FIFO.
Option 1: True
Option 2: False
Answer: Option 1
Challenge 1 : Can we test in a program whether the Stack has been overflowed or not?
Solution to Challenge 1:
Language used: Java
Challenge 2 : Can you convert an array to a queue and use all the queue methods to manipulate that array?
Solution to Challenge 2:
Language used: Java
Challenge 3 : Why LinkedList consumes less memory than ArrayList. Write a program and explain why it happens.
Solution to Challenge 3:
Language used: Java
’’’’’’’’’’’’’'’Explanation’’’’’’’’’’’’’’’
In a singly LinkedList which we’re watching above, two successive memory cells are allocated for each object of the sequence. Two memory cells form a node of a sequence. The first cell stores the object and the second cell stores a reference to the next node of the list.
In a LinkedList, we usually model the memory as a sequence of memory cells, each of which has a unique address. On the cntrary, an array or ArrayList is a contiguous piece of memory.
For that reason, the time complexity is higher for the both, an array and an ArrayList.
Challenge 4 : Can you give examples of hybrid linear data structure that provides all the capabilities of stacks and queues under one roof?
Solution to Challenge 4:
Language used: Python 3.10
’’’’’’’’’’’’’'’We can convert the same code to Java, that implements Deque interface’’’’’’’’’’’’
// The above code has not done anything new, except that we have given the output in a different way.
7. Algorithm, Data Structure, Collection Framework and Standard Template Library (STL)
Algorithm is expressed as a set of steps. By describing the actions at each step we instruct the computer do something. Usually we can use any natural language to describe the actions to perform at each step.
Consider this simple description.
In C++ programming language, on one hand, we can create generic functions to find the maximum or minimum value; and, on the other, we can take help from the ‘algorithm’ template library to find the same values.
Every high level language comes with its own algorithm library. They do so for one reason. Any system of counting or calculation by means of a device like computer involves following a steps or directions. Computer scientists use the word ‘algorithm’ to describe such as ‘set of directions’.
In some cases, these directions could be simple as described above. In most cases, it is much more complex. For complex cases, we need the help of ‘algorithm’ library. Otherwise, we have to do the low-level plumbing, which is much more time consuming and that takes us away from building other important parts of any application.
Historically, the derivation of this word has some interesting facts.
At the beginning of ninth century a mathematician wrote a book called ‘Kitab al jabr w’al muqabala’ (Rules of Restoration and Reduction). The word ‘algebra’ comes from the title of the book. This textbook introduced the use of Hindu numerals and included a systematic discussion of fundamental operations on integers.
The word ‘algorithm’ comes from the name of the mathematician, Abu Ja’far Mohammed ibn Musa al-Khowarizmi.
One of the most famous and well known algorithms is of course Euclid’s Algorithm. It is a process for calculating the greatest common divisor (GCD) of two integers.
We can illustrate this algorithm in the following way.
Notice that the algorithm is expressed as a set of steps. Each step describes some action to take. The important thing is to describe the actions to be performed clearly and unambiguously.
Let us summarize this introductory part on algorithm in one sentence.
Data go inside the computer as inputs, algorithm takes charge, processing the data and after that the data as outputs come out.
By the way, people often mistake the word ‘data’ as singular; but, it is actually a plural form of the Latin word ‘datum’. Since we have used this word too often in our discourse, and will use in future, therefore, for the curious readers I opened up the Oxford dictionary and searched for the word: datum.
Oxford dictionary defines datum as “A thing given or granted; a thing known or assumed as a fact, and made the basis of reasoning or calculation; a fixed starting-point for a series of measurements etc.” It has also made it clear that the plural form of ‘datum’ is ‘data’.
For instance, in Java we have Collection class and in C++ we have containers that manage this data structure part.
We are going to find out how they are related to algorithm.
Introducing Algorithm Library
Now, we have an idea about how algorithm works. For a computer, it is ‘set of steps, or directions, or instructions’. For a chef it is a recipe.
Is not it?
In real world, when somebody asks directions to go to a certain place, we always try to help by giving that person a set of directions.
Right?
In the Google map, same thing happens, but in a different way.
There are trillions of algorithms working worldwide, may be more. As time passes by, it will increase. Quite naturally; because the volume of data increases; we need to structure those data in a more organized way. So we need things like ‘container classes’ in C++ or Collection framework in language like Java. We will discuss them in great detail in this chapter, along with algorithm, and discrete mathematics.
Moreover, it is clear that to avoid low-level plumbing for a huge volume of data we need algorithm libraries. For a small set of data we can manage it by manually, but for a IT product company, it needs very specialized algorithms, to put it eloquently, very complex algorithms that will deliver their complex products successfully.
Let us see two code snippets in C++ to understand why we need algorithm library. It is a component of Standard Template Library (STL). It provides many generic versions of standard algorithms that replace our low-level plumbing.
The first example shows us a simple program where we take two integers from the users and gives the output of the maximum and minimum values.
Running the program prompts us to give two integers. We enter two integers, and we get the maximum and the minimum values.
However, we could have tackled the same problem with less lines of code, if we used the C++ algorithms libraries.
Let us see the next code snippets.
Adding the algorithm header file on the top of our code makes all the difference. Now we can use the max() and min() methods; the libraries take all the load of low-level plumbing. With the less lines of code we get the same result.
In this particular point of our discourse, we need to understand one key concept. Every high level language tries to solve the same problem in their own way using their own framework or libraries.
There are some common algorithmic terms, such as ‘sort’, ‘shuffle’, or ‘search’.
In Java, the algorithms come from the Collection class and the great majority of the algorithms provided by the Java platform operate on List instances. A few of them also operate on arbitrarily chosen Collection instances.
Let us see one example where we sort a list of alphabets in ascending order.
The algorithm described above takes the form of static methods whose first argument is the collection on which the operation is to be performed. Running the above code gives us the following output:
When the volume of the list is small, we can do the low-level plumbing, although it is wise to take help from the Algorithm Library in case of a very large volume of data.
Watch the following code snippets:
Here goes the output with the elements in ascending order.
The above program shows us reordering a List so that its elements are in ascending order according to an ordering relationship. However, to do that we need to hard code from the scratch. It is not necessary. We could have used Java Collection framework and manage to do the same operations by the following code snippets.
We can do the same thing in C++ programming language. C++ Standard Template Library stands between algorithm and containers (data structure) and manages them wisely. We will discuss them in great detail in this chapter. At the same time we will dig deep into the Collection Framework of Java; as Java does the same thing in its own way.
Comparing these two great programming languages we will have a better understanding of how algorithm and data structures are related.
Let us see the same code snippets in C++.
As you see in the above code snippets, the std::sort() method by default takes two arguments-the beginning point and the end point. After that, it puts the list in ascending order. We do not have to reinvent the wheel as we did in the Java code snippets (code 7.4). Besides putting a collection of integers in ascending order, we may turn the order inside out. C++ Standard Template Library lets us do that by passing another argument.
Therefore, we get the following output, where the unordered list is ordered in ascending and descending orders both.
Hopefully we have understood the basic conceptions regarding the algorithm libraries. In this section we have also learned how algorithm and data structures are related.
We will learn more about this in the coming sections.
Different types of Algorithms
There are many types of algorithm; as intermediate learners of computer science, you may have heard about them, and probably used some of them. In this section, we are neither going to learn them by heart, nor we will discuss them one after another.
We simply cannot do that. Even we could do that, we would not even try to do that because most of the examples are available in open source, and they are all over there in the internet.
In my opinion, the best thing we should do is to understand the core conceptions of different types of algorithms, and after that we can try to apply them to solve different types of problems.
We are not going to define algorithm again, we have already learned that ‘set of directions’ or ‘set of steps’ is called algorithm. It is true for everything; as long as algorithm is concerned, humans and computers look utterly alike. They all need directions to do something meaningful.
Some of the better known algorithms are Recursive algorithms, dynamic programming algorithms, Brute Force algorithms, etc.
There is no eternal endpoint for learning algorithm; therefore, especially for basic cognitive process to pick up algorithm, the learning curve is really steep. A steep learning curve will always try to shed you from its roof, just like a steep roof sheds snow; if you want to go the top, and want to master the art of writing your own algorithm, you need to work hard to solve different types of problems.
Recursive Algorithm
Semantically, when we say that a function calls itself directly or indirectly, it is called recursion. However, if we want to put it in an algorithmic way, we should write that ‘it is a set of directions by which we want to divide a problem and conquer’. We can put it in more eloquent way, ‘decrease the problem and conquer’.
We use one function to call itself, and the corresponding functions are known as recursive functions. But there is a drawback or difficulty that is not evident.
First of all, when a function calls itself, it will call itself endlessly if we do not stop it. Therefore, there should be a mechanism to stop it. Otherwise, it might make things look like eternal looping and cause run time error. We need a base case, so that the mechanism called ‘calling itself’ should progress towards the base case.
The second most important things are ‘space, speed and time’. When a function calls itself again and again, it takes place in the stack. The final code or program might end up as a slow program. We will discuss the pros and cons after we get our head around the recursive algorithm a little bit. Let us try to understand how it works, first.
Suppose using a function we want to print 2 and 1. We can do that two ways-one is iterative way, using loop construct. The another way is the recursive way, allowing the the function to call itself. Consider the code snippet below.
If we call the method “printNumber(2)”, what is going to happen? Let us try to understand the core conception of recursion. When we call the above method, passing the integer parameter 2, three clones are made. The original call should give us 2 as output. But it has made a recursive call creating a clone of the function as the value of ‘n’ is now equal to 1. This call should give us 1 as output. After that, it makes another recursive call and makes the value of ‘n’ equal to 0.
However, that is our base case, and we are making progress towards that base case where we have made a condition so that it goes away and stops calling the function. When we reach the base case, that is the output of ‘n’ is equal to 0, the ‘if’ condition is true and it just returns.
If there were no ‘base case’, the stacks would be overflowed. There would be a run time error. The ‘if’ condition or the base case prevents the recursive call from being made again and again.
In our following code snippets we will see some simple examples of recursive functions; then we will move towards the lesser known world of recursion to the unknown world of recursion, solving more complex problems.
The first C++ code snippet will give us a glimpse of recursion in its simplest version. We will move forward to the base case from a given number, and at the same time, we will rearrange the order.
We will move from 3 to 1 and vice versa. Here is the output:
In the next code snippet, we will manipulate the output by squaring the numbers; it is a test case, it shows that you can do many types of manipulations for your own advantage.
The output is quite expected. We have the squared values of the output.
The manipulation of integers could cause run time error and make the stack overflow if your base case is wrong. Therefore, that should be the most important part when you call a function recursively.
We can call a recursive function indirectly, too. The next code snippet in C++ shows you a simple example.
While calling another recursion, we can test whether that integer is even or odd. Based on that, we can make some more recursive calls.
Probably you have already noticed that the fee levied for the use of stack memory is enormous. It happens for the memory allocation and re-allocation. When any function is called from main(), the memory is usually allocated to it on the stack, for the recursive calls different copy of local variables or clone is created for each function call. This process continues till the process reaches the base case.
When the same problem is solved using the iterative methods, it consumes less memory, but usually the length of code is bigger than the recursive one. However, for a small problem like finding factors, we cannot even feel the difference.
Both the functions give us the same output. Whatever number you pass through the functions, it will give you the same output. There is only one difference. The same thing does not take place in the memory region.
Some problems, such as tree traversals, or Tower of Hanoi is inherently recursive. Of course the same problems can be solved iterative way with the help of data structures. If you compare the lines of code then recursion takes less and looks cleaner.
In the next problem, we will find the prime factors of any integer. Usually any prime number has only two factors-1, and the number itself. For that reason, they are called prime numbers. Other integers has more than one factors.
Consider a number-6; it has factors, such as 1, 2, 3, and 6. The integer 6 is divisible by all those factors. In this list, not all factors are prime. Only 2 and 3 are prime factors.
In the next code snippet, we will find only the prime factors of any integer, using recursion.
We have used a loop counter to call the function recursively. We could have written the same code in shorter space if we did not use the loop counter. Instead we could have pass another parameter through the same function.
As you have noticed, to prevent the infinite recursion, we always provide a condition, or base case. The next code snippet, in Java, will give us factorials of any integer.
By using comments, we have pointed out the termination call.
To get the factorial of 4 we do not have to write a long line of code. It has been managed in a shorter space. However, as the number gets bigger, the output acts like a beggar. The ‘int’ data type has a limit. It is overflowed.
In that case, we can use the ‘long’, the bigger version of ‘int’ data type.
With the help of recursion, we can solve such problems quite easily. Moreover, we need to write less line of code than we write when we solve the same problems iterative way.
We need to understand one key concept here. Whenever we call a function an active record is maintained. The active record of each call includes spaces in the stack for many further operations. Remember, between calling a function and returning the value, there are several moments; parameters of methods, local variables, returned addresses;so many things are there. And they all want spaces in the stack. When a function is called, its active record is pushed into the stack, but after that, many things happen.
When a function calls recursively, the stacks get busy with many such operations. This overhead of many operations in the stack, makes any recursion slow. It also uses more memory. Keeping all these barriers that impede free movement of memory, we still need recursion.
Why? We will see in a minute.
We are going to compare two code snippets one after another. In the first one, we get the factorials using iterative way.
Now take a look at the same code, written in recursive way.
The advantage of using recursion is its shortness, cleanliness, and moreover, it is closer to our discrete mathematical definitions. If you think and model your code keeping the mathematical conceptions in your mind, then recursion is close to your definition.
It is more evident when we use recursion in finding the Fibonacci series. Mathematically, the Fibonacci is defined as the following:
If we want to find the Fibonacci series using recursion, it is not only the simplest version, but also mathematically similar.
Watch the next line of code snippet.
We can do the same thing using iterative way, but it does not reflect the mathematical conception as the recursion does. The next code snippet shows us the same thing.
Ability to simulate the mathematical models cannot always give the moral support to the recursions. Because there are many other factors while we program, we need to keep them in our mind.
Maintaining that the recursive versions are slower than the iterative versions, we may still want to adopt it for some reasons. When the code is heavy in iterative versions,it is much simpler to adopt the recursions, it also easier to debug and maintain. Whatever our reasons to adopt the recursions, slowing down the program may cost at the end.
We cannot save the memory space, we cannot speed up the execution; yet, in some cases, recursions are essential.
As we progress, we will find that later.
Binary Tree and Data Structure
Binary tree is a specialized representation of data structure. Our main purposes of studying data structure is to organize data in the most efficient manner. Binary tree is used for data storage purposes. A tree is represented by nodes that are connected by edges or pointers.
A binary tree has a special condition, which has also made it very special among other data storage mechanisms. A node of binary tree can have maximum two children. When it has one or two children, we call it a sub-tree. Therefore, each node of a sub-tree might have more sub-trees.
When a node of a tree or a sub-tree does not have any children, it is called leaf node.
While traversing a tree, we always take the leaf node as our end point, or base case for recursion. We can search a binary tree as an ordered array. Besides, we can insert and delete data in any binary tree just like a LinkedList.
As we have just said, a tree and its sub-tree always create a sequence of nodes; this sequence is known as path. This path starts from the top of the node, which is referred as root. Always there is only one root and one path to the other node. It has no duplicate path.
The nodes that are placed below, are called children; and, these children nodes always have nodes above them, which are known as parent nodes.
If we consider a root node as the parent node, then it is said that this node is on the level one. The node just below is placed at level two.
Finally, a tree can be traversed in various manners; they are pre-order, post-order and in-order.
There are different algorithms for each kind of tree traversal. In this section, we will take a brief look at those algorithms of tree traversal.
Tree Traversal Algorithm
As we have just learned, in some cases, recursions are essential. Especially when most of the tree based algorithms are concerned, they can be easily implemented using recursion because a binary tree is a recursive data structure. Although it is wise to learn solving the same problems using without recursion; you can solve the tree based algorithms using iteration, also.
In this section, we will learn tree traversals using recursion and using iteration, both. As the tool of our learning, we have chosen Java as programming language.
In the following sections, we will take a brief look at other algorithms as well.
A hierarchical tree structure usually represents a set of linked nodes. We can associate an ADT or abstract data type as it simulates the tree structure. It has a root value and the sub-tree of children may be connected with the parent node.
First of all, let us think about an image of tree structure to understand the concept. We will use this image in the following code snippets also.
In the above input section, the collection of nodes start with the root node. Each node is connected with a list of references to children nodes.
This tree structure can be defined recursively; because each node is a data structure consisting of a value that generates the list of references to children nodes.
Although there are other methods to do the binary tree traversal, but the most popular ways to traverse the trees in Java are the pre-order, post-order, and in-order traversal. In this section, we will take a detailed look at each of the traversal methods.
Generally, when you traverse a tree you have three choices-root, left sub-tree, and right sub-tree. In which order you will traverse the tree decides the nature of the algorithm associated with it.
By default a binary tree is a recursive data structure; firstly, it has similarities with LinkedList, which is also a recursive data structure; secondly, if you remove a node, rest of the structure is also a binary tree like left and right binary tree. For that reason, when a function calls itself, it is easier to traverse from one node to the other.
In this section we will limit our discussion to three types of binary tree traversal algorithms; they are pre-order, post-order and in-order.
A tree can be traversed by our algorithm in several ways. In pre-order traversal, it goes this way: root, left and right. In post-order traversal, it goes this way: left, right and root. The in-order traversal traverses the nodes this way: left, root and right.
We have seen how LinkedList works, therefore creating nodes is not difficult. We can start visiting the root node first, then we can move towards the left sub-tree and after that we can proceed towards the right sub-tree.
We should start with the Node class, because a tree has nodes. After that we might use recursion to generate more nodes. The advantage of node is it has left or right pointers that point either to the left side or right side.
To do the pre-order tree traversal, we can use a method like ‘preOrder()’ that will call itself by passing the node object; you can give it any name.
Pre Order Traversal
Let us start with a pre-order tree traversal code snippet, which will give us a clear picture how we can implement our algorithm. By dissecting that code, we will learn how pre-order tree traversal algorithm works. In the first code snippet, we will see how we can traverse the tree without using recursion. We start traversing the tree with the root node, after that we visit the left sub-tree and next, we visit the right sub-tree.
Each sub-tree is also visited pre-order way. That means it starts from the root of the sub-tree, then it goes to the left sub-tree and the above process continues until we reach a leaf node.
The pattern of traversal tells us one thing very clearly; this tree traversal is a good candidate for recursion. After visiting the root node, we can recursively visit the left sub-tree and then, we can visit the right sub-tree recursively again.
Still recursion is not the only way. We can do the traversal using iteration also. The following code snippet shows that example.
The output is quite expected here, in the comments section we have declared how it will look like.
In general, recursion implicitly uses a Stack data structure. In recursion when we reach the base point, it starts unwinding. For that reason, in the above code, we have used the Stack data structure to traverse the tree without using recursion.
In a tree, the leaf node is the base point. Reaching that point node becomes null, and we reach the base point.
To simulate the recursion we need to apply Stack explicitly, instead of implicitly. Let us watch the above code snippets to understand what happens inside.
First, this part where we have defined and declared the static ‘TreeNodeClass’:
We want three ‘node’ objects, and we name them as ‘left’, ‘right’, and ‘root’. We also need a ‘String’ data type variable to store our values. The advantage of making the class static is it will run the program faster.
The only problem of using iteration lies in its length of code. We cannot make it as concise and readable as we can do using recursion.
Let us try to do the same pre-order tree traversal in recursive way. Comparing these two code snippets will give us a good idea about why binary tree traversal is usually done recursively.
If you are a Java developer, you have probably guessed why we have used the pritf() method:
We want to store the data to build the structure. Until we reach the base case or leaf node, the function traversingPreOrderByCallingItself() calls itself and keeps adding the value to the node tree. Therefore, the output will be same as before.
Now what are the main differences between the iteration and recursion? For the pre-order tree traversal algorithm, things were a little bit complicated. We needed a Stack of node first, then we pushed the tree root in the Stack. After doing that we keep pushing the right child and left child; at the same time we pop all nodes one by one for each node.
But in recursion the recursive method takes a Node in parameter and and calls itself to add left child and right child. The code snippet, not only looks concise, but it is also more readable.
Post Order Traversal
To visit all the nodes and print their values are known as tree traversal. Just like LinkedList, in pre-order traversal, we start from the root or head node and then we move until we reach the leaf node.
The post-order tree traversal represents just the opposite. Here the root node is visited last. We start from the left sub-tree; then we visit the right sub-tree, and after that we reach the root node.
Therefore, we should use recursion to visit the left sub-tree; then visiting right sub-tree recursively leads us to our final destination-the root node. The post-order tree traversal follows this algorithm.
Let us see the post-order tree traversal using iteration as we have done in the previous case of pre-order traversal.
We have mentioned the output in our comments section. The same output waits to greet us below.
As always, this type of tree traversal can be managed in a better way using recursion. Visiting the left sub-tree leads us to more recursion of right sub-tree, and finally we reach the root node.
The next code snippet shows us how we can do the same operation in a more concise way using recursion.
Now we can compare the code snippets where we have once used iteration and after that we have used recursion. Still the output is same.
Whenever we use pre-order traversal, the displayed data are different than case when we use post-order traversal. Moreover, it depends on how we want to visit our nodes.
In this section, finally we will use in-order tree traversal, which is different than the previous two cases of tree traversals.
In Order Tree Traversal
When we use in-order tree traversal, we start visiting left sub-tree recursively first. After that, we visit the root and then the right sub-tree. Each node stores a value and we know them as key. In the in-order tree traversal, the output produces sorted key values in an ascending order.
Just like the previous two cases, we will do the in-order tree traversal using iteration first. After that, we will use the recursion.
The inner mechanism of in-order tree traversal is not evident in the output. As we have mentioned earlier the output will be presented in an ascending order.
The main disadvantages of using iteration include the lack of conciseness. For a huge binary tree the lines of code could be too long. Maintaining such huge data could be an issue, yet it is faster than using recursion.
The next code snippet will show how we can use recursion to make the same code look more concise.
If we take a close look at this part of the above code it will give us more insights into the algorithm of recursion.
In the in-order tree traversal, we have recursion this way. However, the post-order algorithm of recursion was different. Let us the same part in the post-order recursive tree traversal.
Finally, we will take a look at the pre-order recursive tree traversal algorithm. The position of calling the function itself changes with each recursive tree traversal style.
How we use our recursive algorithm is extremely important. Although recursion has many advantages and disadvantages, we need to use this algorithm wisely. We need to understand how it works, we need to keep in our mind one key conception regarding recursive algorithm, which is there must be a base case. Recursive algorithm should have a base case, and in the above examples, we have seen the how we proceed towards the base case where we have a condition to stop the recursion.
The condition checks whether the root is null or not. That was our base case when we had been traversing the tree. Whatever be the style of tree traversal; pre-order, post-order or in-order. We have to maintain the same logic when we use recursion.
We have seen and discussed a small part of algorithm; it is as much as a teaspoon holds. In the coming section we will get more insights about data structure.
Collection Framework in Java
We have already learned that Java Collection Framework has three key components that work together.
First one is Interface.
The second one is Implementations; classes that implement interfaces. We get objects from those classes on which algorithmic operations are performed.
The third and the final one is Algorithm. This is the final goal; because algorithms in Java Collection framework are methods that perform useful computations, such as sorting, searching, shuffling, etc. Algorithms are polymorphic; it means, the same method can be used by an Iterator object, and as well as an ListIterator object.
We will see those examples in a minute.
We have just said polymorphic algorithm methods. Let us see how add() methods work on different Collection objects.
The above code snippets give us the following output; furthermore, from this output, we understand that we can do the same computation using different types of collection objects.
As we progress, we will learn more about data structure and algorithm through Java Collection framework. We will also find how Java Collection framework unifies its implementations using many discrete mathematical abstractions.
Discrete Mathematical Abstractions and Implementation through Java Collection
We have not forgotten the key discourse of this book. Discrete Mathematics, data structure and algorithm are inter-connected. It is evident when Java Collection Framework introduces Set collection. The Set Collection in Java does not allow duplicate elements, just like mathematical Set abstraction does in the same way. Moreover, Java Set models on discrete mathematical set abstraction.
Let us see an example:
We have used sorting algorithm in the above Set Collection. After sorting is over, we have easily picked up the first and the last element.
We have said earlier that the Set Collection implements discrete mathematical Set abstraction. Let us check it by changing this line of the above code.
Run the code and you will get the following output, where one ‘100’ is missing.
In many cases, discrete mathematical abstractions are implemented in data structure and algorithm. In future code snippets, we will see more examples like above, where same things happen, in like manner.
There are more mathematical abstractions wait for us. Map Collection models mathematical abstraction, such as ‘function’. Let us see how it works in Map Collection.
It gives us the following output, where the key and value pairs work together.
Actually, the Map interface provides a small nested interface called Map.Entry. You have seen in the above code snippets. It is a part of the Collection view methods. The Map interface includes methods or algorithm for all type of basic operations on data structures, such as put, get, remove, etc. Map interface provides bulk operations, such as putAll or clear. There are Collection view algorithms also, keySet, entrySet, and values are among them.
Comparator, Comparable and Iterator
Java Collection framework provides three major interfaces, which have all the qualities of being important and worthy of note. Comparison plays a great role in sorting or shuffling algorithm. In the like manner, iteration is also very important.
We are going to see a few code snippets where these three interfaces ( Comparator, Comparable and Iterator ) are implemented.
To get elements in sorted order, we can use TreeSet and TreeMap from Java Collection Framework; but, it is the Comparator or the Comparable interface that precisely defines what sorted order means.
Implementing the Comparator and Comparable interfaces, we can have objects that encapsulate ordering. Watch the next code snippet:
First of all, we have sorted the names of the account holders; in similar fashion, we have printed the names of account holders based on sorted account numbers in ascending numbers.
What will happen if inadvertently someone adds a negative account number? Therefore, for the second part of the code where we have implemented Comparator interface method compare(), we should write the logic in this way.
Now, our code is more protected. Why we need to take such protections? It is little bit theoretical and this book is not about only Java Collection Framework. Yet, it is good to know that if an integer is a large positive integer and another integer is a large negative integer, then their subtraction will return a negative integer. To represent the difference of two arbitrary signed integers, a signed integer type is not big enough.
When we implement Comparable or Comparator interfaces, we need to maintain the technical restrictions.
If we want to compare two elements, especially for that type of algorithm, implementing the Comparable interface is always the better choice.
We can get the city names in ascending order. For the algorithm that is related to sorting and comparing, it is a good choice in Java Collection framework.
Finally we will curtain the Java Collection framework with iteration. It is also a very important part of algorithm and data structure. Java Collection framework handles it quite well by implementing the Iterator interface.
To traverse through a collection of elements, we need iteration; and, to do that we need an iterator object that implements the Iterator interface.
When we want to iterate through the elements in a collection, and display each element, the easiest way is shown above. Employing an iterator object is the best solution to such algorithmic problems. The iterator object either implements the Iterator interface, or implements ListIterator interface.
We have learned some key concepts about data structure and algorithm; moreover, we have also seen how they model the discrete mathematical abstractions.
In the next section, we will find some more interesting facts about data structure and algorithm through C++ Standard Template Library.
Standard Template Library in C++
We should avoid the practice of being unjust to C++ Standard Template Library, in short, STL. It is a very big topic that we cannot discuss, entirely, in a small section. We can only compare it with Java Collection framework.
C++ STL mainly deals with three main components. They are, Container, Algorithm, and Iterator. However, before moving to the STL, we will try to understand what templates are.
In C++, we use templates to create generalized functions and classes.
Why we need generalized functions and classes? For many reasons, of course. But, the foremost among them is through generalized functions and classes, we can use any data types, such as integer, floating point values, characters, etc. The list is not finished yet, we can use also some user defined data.
Theoretically, templates are foundations of generic programming. You can write code in a way, which is independent of any particular type.
Using template we can create a generic class or function.
The C++ STL containers, iterators, and algorithms are ideal examples of generic programming. C++ Standard Template Library has been developed using template concept.
For that reason, let us try to understand first, how these template concepts work.
In the above code, we have created a general template method ‘FindMax()’. It helps us to find the maximum number. We can pass integers as well as floating point values.
We can use templates to create not only generalized functions, we can create generalized classes. Now using generalized classes, we can apply any data type, such as int, float, char, etc. In some cases, we can use the template specialization that helps us to use any particular type of data, like char. By this template generalization we can apply special template function for specialization. We will come to that point in a minute, before that let us see how generalization of template classes is used.
We have created a general template class of Stack data structure that organizes data by adding value on the top of the table. It can also remove that data as well.
A boolean method to create whether the stack is empty or not, is also checked.
Now we can add more functionalities to this general template Stack class. It will give you an idea how STL in C++ works behind the scene.
Because we have created generic methods, we can now create Stack object of int data type that will implement a few key algorithms, such as adding, removing, etc.
The above code snippets give us an idea how STL provides common data structures, such as lists, stacks, arrays, etc.
To understand STL properly, we need to learn how to create template classes are created.
The next code snippet gives us a more robust flavor of stack data structure where we have removed all the elements and caught the exception.
Let us see the output first. After that, we will see why we needed such working experience of creating general template classes and generic methods.
Now, one thing is clear. The Standard Template Library (STL) is a set of C++ template classes that provide common programming data structures and functions, such as lists, stacks, etc. The three major components, container classes,algorithms and iterators work at tandem. In the above examples, we have seen how the components are parameterized.
You approach to understand STL whatever way, in our view, this is the best way. Try to gain the working experience of creating your own general template functions and classes.
Therefore, we will devote more time to understand this conception, after that, we will see a couple of STL examples.
Let us create a simple general template function, as well as a specialized template function.
The above code snippet is fairly simple. Through the general template function, we can pass any data, like int, float, char, string, even user defined data. However, using specialization makes a key difference. We can now only pass char data.
In the same way, we can create general template class and method that will allow to pass any data of our choice.
It appears in the above code, our choice has no limitations. Now we can create an object and call the same method again and again to pass any data.
In like manner we can also create a specialized template class and method that will allow a certain type of data to be passed.
A specialized template class that allows only int data type. If you pass a character, it will convert that value to its ASCII equivalent. Watch the output:
We have had enough working experience with the general template classes and functions. Now we can taste of STL. Before using STL, we need to know that container classes mainly manage the data structures part.
Basically, containers store user defined data and primitive data. There are several standard container classes, header files, etc.
In such a small section we cannot elaborate them. We can have a brief look at how they are used. There are sequence containers that implement a special type data structures. The specialty is it can be accessed in sequential manner. It includes vector, list, array, etc. With the new version of C++, something new will be added in the future. Therefore, the list is incomplete.
There are also associative containers that implement a certain type of data structures. Associative containers have set or map; they implement sorted data structures that can be structured in a key and value pair. We will a few examples.
Let us start with ‘list’. We will also check different types of STL algorithms that are associated with ‘list’.
We have implemented a couple of algorithms that show general functionalities, such as sorting, searching, adding, removing, or copying.
Hopefully, comments will guide us how STL data structures and algorithms work clasping each other’s hands.
Another good sequential container class is ‘vector’. It is better than array in one sense. It can automatically adjust its storage capabilities. The following example will show us how it works.
With the help of ‘vector’ we can definitely solve certain type of problems. The only advantage is with array, we need to do the lower level plumbing. We can avoid them by implementing vector data structures and associated algorithms.
Whatever data structures we implement, or whatever algorithms we use, we need to keep one thing in our mind. Time-complexity is a big factor. In the next example, we will see ‘set’, and after that ‘map’. Both belong to the Associative Container classes. There is an advantage. Because it implements sorted data structures, the search operation takes less time.
In the next chapter we will try to understand how time-complexity works, and why it is necessary in programming.
Before that, in the next examples, let us see how the ‘set’ data structures from STL sets in with its key algorithms.
As we have seen in the above code, there are lots of different types of algorithms applied in ‘set’. Moreover, it is implemented as sorted data structures. But we can always change that order, quite easily.
As an associative container class, ‘map’ has also many advantages; with the help of STL algorithms, we can operate on any value through keys.
To show the key, value pair properly, we have to use the tabular format. Although we have not use many algorithms, yet hopefully it will give us an idea about how ‘map’ works.
So far, we have taken a very brief look at the C++ STL, and before that, we have also seen Java Collection framework in action.
In the next chapter we will try to understand another key concept, time-complexity.
Question 1: Every high level language comes with its own algorithm library.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 2: Semantically, when we say that a function calls itself directly or indirectly, it is called recursion.
Option 1: The above statement does not make any sense.
Option 2: Based on this statement the recursive algorithm is formed.
Option 3: The recursion has its own limitations if we cannot stop it.
Option 4: Recursive algorithm is the fastest among all.
Answer: Option 2 and 3. Both are true in this case.
========================
Question 3: To prevent the infinite recursion, we always provide a condition, or base case.
Option 1: True
Option 2: False
Answer: Option 1
=======================
Question 4: The ‘container classes’ in C++ and Collection framework in language like Java play the same role.
Option 1: False
Option 2: True
Answer: Option 2
=======================
Question 5: For a huge binary tree using iteration could be too long as long as code lines are concerned. But it is faster than using recursion.
Option 1: The above statement is false.
Option 2: Iteration is always slower than recursion.
Option 3: Although the code could be too long, it faster than recursion.
Option 4: Recursion is faster than Iteration
Answer: Option 3
Challenge 1 : Standard Template Library (STL) provides many generic versions of standard algorithms that replace our low-level plumbing. Can you show the difference by writing code.
Solution to Challenge 1:
Language Used: C++
Challenge 2 : Can you prove with an example how the STL makes a difference when we want to sort a list of data.
Solution to Challenge 2:
Language Used: Java
// Here goes the output with the elements in ascending order.
’’’’’’'’Conclusion’’’’’’’’’’’’
In any high-level language, the name of the Standard Template Library may differ, but it drastically reduces the code size.
Challenge 3 : Using recursion in programing is closer to our discrete mathematical definitions. Can you prove it?
Solution to Challenge 3:
Language Used: Java
Mathematically, the Fibonacci is defined as the following:
Challenge 4 : Recursive algorithm should have a base case. Write a program where we can proceed towards the base case and a condition to stop the recursion.
Solution to Challenge 4:
Language Used: Java
Challenge 5 : Can you prove that Discrete Mathematics, data structure and algorithm are inter-connected?
Solution to Challenge 5:
Language Used: Java
’’’’’’’’’’’’'’CONCLUSION’’’’’’’’’’’’’’’’
The Set Collection in Java does not allow duplicate elements, just like mathematical Set abstraction does in the same way. Moreover, Java Set models on discrete mathematical set abstraction.
’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
Challenge 6 : Can you create a general template class and method that will allow to pass any data of your choice.
Solution to Challenge 6:
Language Used: C++
8. Time Complexity
We have already seen many examples of different kind of data structures and algorithms. We have also found out that to implement such algorithms, we do not have to go through low-level plumbing. Any stable good programming language provides many libraries to avoid such low-level plumbing and different types of algorithms, such as sorting, shuffling, or searching can be done through them.
We have also found out another important fact that tells us about the relationship between algorithms and discrete mathematics. Data structures are discrete structures and hence, the algorithms are all about discrete structures.
Let us consider a simple Java program, where we iterate over two loops at the same time. These outer and inner loops are connected with each other and they finally give us an output.
And from this very simple program, we get an output like the following:
You have probably noticed that it is a simple algorithm of finding the total of inner loop and the outer loop.
However, to do that, we need to jump sequentially; each iteration has taken place within a space of discrete data structure.
Any problem has one or more possible solutions. Suppose, we were asked to find the total of the one loop only; we should change the above code to the following code:
Now, in the changed circumstance, we will come up with this output:
Now we can also solve this problem another way. By calling the function recursively, we can also solve the same problem. Our problem was, how we could get the total of a series of positive integers that starts from 0 and ends at 5.
It looks like this: total = 0 + 1 + 2 + 3 + 4 + 5; here the end point was chosen as 5. The value of total is 15.
We could have taken the user input and get the total of any big positive integer. Manually it is easy to get the total of a small series. But consider a case, where user gives us an input like 10 to the 6. Now, manually it is not possible to add all the numbers from 0 to that number.
It becomes cumbersome.
Therefore our first algorithm was based on using looping. It easily adds all the numbers, whatever be the end number.
As we have said, any problem might have one or more solutions. Here we can solve the same problem, recursive way.
And we get the same output:
In the above case, the same jumping of iteration occurs through a discrete structure, but in a different way. It starts from a discrete positive integer 5, and the addition process goes on until we reach the base number by calling the function recursively.
Now we have found that this problem has two solutions. Question is which is desirable? Which algorithm is better than the other?
Here comes the question of time complexity.
Time-Complexity has nothing to do with the execution time.
But it has many things to do with the algorithm. Time-Complexity talks all about the better algorithm. A better algorithm always takes less time to reach the desirable discrete element (here output of a total).
Here we have to iterate over a series of positive integers,to reach our desirable goal. If we could have reached in one iteration, that would be the best algorithm, no doubt.
But it is Utopian.
On the contrary, the worst case will take us to an iteration, that has to traverse each element for an indefinite period of time. In like manner, you may imagine a situation where user gives an input to find whether the integer is prime or not. There are several algorithms to test a number whether it is prime or not. Although there will be one algorithm that is better than other, and takes less time to give you the output. It depends entirely on the size of the input. For a real big integer, one algorithm might take several minutes to complete the operation and for another algorithm, it finishes the operation in a few seconds.
Before trying to understand Time-Complexity, we should remember that actual time requires to execute code is machine dependent; another key thing is network load. But the Time-Complexity has nothing to do with your machine configuration. It is all about better algorithm. That is why, whenever the terms data structures and algorithms appear, time-complexity comes with them.
In this chapter, we will try to understand how time-complexity works. What does the term Big O notation stand for? But before that, we need to understand what does ‘order of n’ mean?
Order of n, or O(n)
Let us implement an algorithm that will check whether the number or ‘n’ is prime or not. We know that a prime number is divided by only two numbers, 1 and the number itself and it has no remainder. Consider a prime number 11, it has only two factors 1, and 11. If we divide 11 by 1 and 11, we have no remainders. That is the most basic definition.
Therefore, using a normal looping construct, we can try to check whether that number is prime or not. We can write our algorithm in natural language, this way:
We can test this algorithm by using a small positive integer like 11. In that case, if we iterate from 2 to (11 – 1), we will see that between 2 to 10, there is no integer that can divide 11 with no remainder. Therefore, 11 is prime. When the value of n is small, our algorithm does not take much time. In fact, that time may appear negligible. Suppose for each iteration, it takes one millisecond or ms. This fraction of second stands for 10 to the power minus 3, that is, if you divide 1 second by 1000, then you get one millisecond.
By the way, as a student of computer science student we should know that one microsecond is 10 to the power minus 6, one nanosecond is 10 to the power of minus 9 and one picosecond is 10 to the power minus 12.
Now, let us assume that each iteration takes one microsecond. When the number of iteration is small, it does take mush time. However, with the increase of iteration, this time also increases. Instead of 11, we want to check a value like 1000003, it will iterate more than one million times. Now our algorithm appears to crumble. Because our algorithm will take huge time to finish the process of iteration.
We can transport this natural language algorithm to a Java code.
We are not going to run this code; it is just for an example of time-complexity. In the above code you can guess that to reach our goal we need to iterate ‘n’ number of times. Here ‘n’ is user’s input. It is called ‘order of n’ or O(n). As the value of ‘n’ starts increasing, our algorithm starts crumbling. For a value like 10 to power 6 plus 3, it takes huge to time and it simply goes out of our hands.
Therefore, we have to search for a better algorithm to solve the same problem.
Let us take the user’s input and instead of iterating over the whole range of that number, we can calculate the square root of that user’s input. After that we can iterate up to that value.
It serves our purpose, and at the same time, shortens the length of iteration in a great way. Let us write the code in Java.
If we have a user’s input like 1000003, our algorithm allows us not to iterate more than 1 million times. On the contrary, it allows us to iterate in and around 1000 times.
We write another short Java code to find the square root of any number.
In the above code we can clearly see that the square root of 1000003.
Let us go back to the code 8.5, where we have successfully shortened the length of iteration by calculating the square root of user’s input. Reducing the length of iteration means the algorithm takes much less time. Now the running time of our code is not ‘order of m’, but ‘order of square root of n’; it means O(square root of n).
It is quite evident that code 8.5 represents much better algorithm than code 8.4 to solve the same problem. Why so? Because, it solves the large test cases in acceptable time. Better algorithm is necessary for better performance.
Here the running time of the program depends on the user’s input; and, user’s input represents the growth of time which exerts influence on the running time.
As a conclusion to this section, we can say that time-complexity only cares about the ‘input’. It also cares about the algorithm that tackles the ‘input’ in a better way.
Moreover, time-complexity does not care about the machine’s hardware configuration. Whether the machine is single processor or multi processor does not matter. What is the network load, is unimportant. The time-complexity really does not care about 32 bit or 64 bit.
Input is the key word for time-complexity. Nothing else.
In the above examples, we have inputs like single integer. It could have been an array, a data structure; is not it? Now, again, we come to the point of the data structures, and algorithms. And, of course, now we also understand how time-complexity is related to them.
Big O Notation
We have already seen how through order of ‘n’, or ‘input’, we can represent any algorithm. Consider the code 8.4, where we have to iterate over one million times if ‘n’ is 1000003. Therefore, the O(n) is the worst case scenario. Any algorithm could not be worse than that O(n).
Big O notation is the most convenient way to express the worst-case scenario for an algorithm. Compare code 8.5 with the previous code 8.4. In code 8.5, the Big O notation is like this: O(square root of n). This algorithm could not be worse than anything. Now, comparing these two worst case scenarios, we can safely tell that the algorithm of code 8.5 is better than code 8.4.
If you are still confused with the definition, do not worry. Everything takes time. The concept of time-complexity is related to algorithm. More you practice,you will be able to get your head around this topic.
Let us try to call back the second code snippet of this chapter. The code 8.2 looks like this:
For brevity we have trimmed the code omitting the commented parts. When we have declared the variable ‘totalOne’ to 0, the constant is 1 and the time is also 1.
In the next step, the first line of the ‘for loop’ tells us about two things. First, the constant is more than one, and the same rule applies for the time.
In the next line the algorithm keeps adding the value of iteration for ‘n’ number of times. Therefore, we have two constants, but the time equals ‘n’.
The last line gives us the output. The constant is 1, and the time is also 1.
When we add them up, we get O(n). It happens so, because while summing it up, we remove all the constants.
However, for the code 5.1, where we have used nested loop to get the total of inner loop, this order of ‘n’ changes to (n * n), that is n squared. Therefore, the Big O notation or the worst case scenario of that algorithm is O(n2). Although in that code, we have total of the outer loop also. That means, we have O(n) at the same breath. Since there are consecutive statements, we count the statement with maximum complexity, that is O(n2).
We have got an initial idea of how Big O notation works, and what principles govern it. Many common algorithms we use very often, use the above Big O notation O(n^2).
Common algorithms, such as bubble sort, selection sort and insertion sort takes the above Big O notation.
There are many other Big O notations that are related to different types of algorithms. The idea of time-complexity rests on finding the better algorithm to solve a problem.
We should analyze the Big O notations, or worst-case scenarios for the test cases and we will try to find the best solution.
Question 1: Time-Complexity has nothing to do with the execution time. Time-Complexity talks all about the better algorithm.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 2: Whenever the terms data structures and algorithms appear, time-complexity comes with them.
Option 1: The above statement is true but it is machine-specific.
Option 2: The above statement is true but it is not machine-specific.
Option 3: The above statement is true but it is not only machine-specific, but also better algorithm-specific.
Option 4: None of the above statement is true.
Answer: Option 3
=======================
Question 3: Big O notation is the most convenient way to express the worst-case scenario for an algorithm.
Option 1: It means we can compare two algorithms and decide which one is better.
Option 2: Comparing two worst case scenarios we can decide which algorithm is better.
Option 3: Comparing two best case scenarios we can decide which algorithm is better.
Option 4: None of the above statement is true.
Answer: Option 1 and 2. Both are true.
=======================
Challenge 1 : Data structures are discrete structures and hence, the algorithms are all about discrete structures. Write a program to establish a relationship between algorithms and discrete mathematics
Solution to Challenge 1 :
Language used: Java
Challenge 2 : How we can get the total of a series of positive integers that starts from 0 and ends at 5. Can it be done recursive way?
In the above case, the same jumping of iteration occurs through a discrete structure, but in a different way. It starts from a discrete positive integer 5, and the addition process goes on until we reach the base number by calling the function recursively.
Challenge 3 : Detect the main problem in the code below and rewrite it in proper way.
Language used: Java
Solution to Challenge 3 : In the above code you can guess that to reach our goal we need to iterate ‘n’ number of times. Here ‘n’ is user’s input. It is called ‘order of n’ or O(n). As the value of ‘n’ starts increasing, our algorithm starts crumbling. For a value like 10 to power 6 plus 3, it takes huge to time and it simply goes out of our hands.
If we change the following line
to as follows
Our problem is solved. In the first line the program iterate more than 1 million times if user decides to give 1000003 as input. However, in the second case, the iteration reuces to in and around 1000 times for the same number.
Language used: Java
9. Set, Symmetric Difference and Propositional Logic
Set, Symmetric Difference and Propositional Logic are kind of heart and soul of Discrete Mathematics, as well as Computer Science. These concepts are immersed in data structures and algorithm, too. Without understanding these key concepts, we cannot move forward. Our knowledge of data structures and algorithm will remain incomplete.
In the previous chapters, we have seen a few implementations of Sets in various programming languages. We have also found out that sets are basically statements, just like we use the term statement within the coding paradigm.
Let us write a discrete mathematical Set of even numbers like this:
We can read the Set of this even numbers as a statement – it is an even number. In discrete mathematical concepts, numerical sets are considered as a collection of discrete numbers that will not contain duplicate numbers. Yet, this statement will come into possession of something else if we replace the words ‘numbers’ with ‘things’.
We will come to that point in a minute.
If two different sets contain duplicate numbers, the symmetric difference will easily point out those numbers. Set uses XOR symbol to express the inequality, and this Boolean concept is implemented in every programming language.
Here comes Propositional Logic. We cannot find Symmetric Difference of two Sets without implementing Propositional Logic.
Propositional Logic is concerned with statements that are governed by ‘truth values’; true and false. These truth values are assigned to analyze these statements either individually or in a composite manner.
These discrete mathematical concepts are not only interrelated, they also help each other clandestinely, within the coding paradigms.
We have found out in the previous chapter, in Java Collection framework, a Set is a Collection that cannot contain duplicate elements. It models the mathematical Set abstraction, Symmetric difference and Propositional logic one behind the other.
The concepts of Propositional logic plays a key role in some special cases, such as where the Set Collection defines behaviors like ‘equals’ or ‘hashCode’ operations.
Let us consider an example, where we have two Set instances. If we want to compare them, we need to model the discrete mathematical Set abstractions, as well as Symmetric difference and propositional logic. Only if two Set instances are equal, when they contain the same elements.
A simple discrete mathematical example will clarify the concept.
Here numbers 1, 8, 11 and 45 are in each Set. However, 5 and 9 are in both Sets.
While we are not going to the details of the discrete mathematical representations, nevertheless we need to understand why implementations of Set abstractions are necessary in Data Structures.
So far, we have seen some mathematical Set implementations with numerical sets. But it could be of ‘things’. Even mathematically when we define a Set, we say, a Set is a Collection. Collection of what? Collection of things. And, of course, these things should have a common property.
We can imagine a Set of biking gears, such as helmet, gloves, goggles, shoes, etc. In another example of Set we can think of our two eyes – left eye and right eye.
We have already learned the Set notations, so we can write these two examples this way:
In the first example, we have used three dots, they are called ellipsis. It means, the biking gear list is endless,or infinite. On the contrary, we have second example, where we have exact two elements.
We call the first one ‘infinite set’ and the second one is a ‘finite set’.
Set abstractions may seem pointless as far as Mathematics is concerned, but this statement is contextual. In many situations Sets can become building blocks of highly complicated mathematical concepts, such as Graph Theory, Abstract Algebra, Linear Algebra, Number theory, etc.
Furthermore, Set abstractions are building blocks of Data Structures, of which we are interested at present.
Why Set is important in Data Structures
The Set abstraction has some special characteristics. While we organize our data, we need to implement those special characteristics of Set abstraction. By organizing data we mean creation, retrieval, modification, and removal of data. In such operations, the implementations of Set abstraction comes to our help.
Let us try to understand this part.
An element can exist only once in a Set. This Collection does not allow duplication. Particularly, this trait makes Set different from others in Java Collection framework. When we organize our data structures, we can plan it in that way.
We can store unique elements. Since, by default, Set abstraction does not care about order, in some cases, we can take advantage of that character also.
Based on such characteristics, we can store discrete elements without duplication, and we need not care about the order of the elements always. Although Set does not care about order, in some cases, we do not need chaos, but order. In such cases, we have the general-purpose Set implementation ‘TreeSet’. It maintains the order of elements based on their values. But implementation of ‘TreeSet’ comes at a higher cost. It is slower than the other Set implementation ‘HashSet’, which stores data in a hash table. As long as the order of iteration is concerned, the ‘HashSet’ implementation does not guarantee that.
Let us take a look at the both implementations.
We have implemented both, the ‘HashSet’ and the ‘TreeSet’. We can see the difference in the following output:
There is another implementation, called ‘LinkedHashSet’, which is implemented as a hash table with a linked list running through it. This implementation maintains the order in a different way. As we insert new element, the insertion-order is maintained.
We have started this section with a question, why Set is important in data structures?
A one-line answer is, we can take any Collection containing duplicate elements and convert it to another Collection removing all the duplicate elements. We can do that by implementing the Set discrete mathematical abstraction.
How Symmetric Difference and Propositional Logic combine
The very conception of Symmetric Difference does not exist without the Set abstraction.
Based on that, we can easily implement that abstraction in our Java code.
In general, the symmetric difference of two Sets mean a new Set, where duplication is nonexistent.
What kind of output we can expect here? The arguments we have passed possess unique words, as well as duplicate words. We have tried to identify both the unique words, and the duplicate words. After that, we have produced them in the following output.
We have implemented the symmetric difference abstraction in a different way. Implementing the same abstraction in a different way, may have produced only the discrete words, without duplication.
We can use a few other Set Symmetric Difference; however, we need to understand other Set algebraic operations.
To do that, we can write a simple Java code that will show us how the implementation of Set abstraction does not allow duplication.
The output is quite expected. The Set will not allow duplication. Therefore, repeated entry of same element will not be stored.
In discrete mathematical paradigms, Set abstraction is basically chaotic, and unordered. But it makes the difference in one area. No duplicate element is allowed in the world of Set. The next code snippet and its output will show you the same property.
For the Java enthusiasts we will write the same code again,to show that different types of output can be produced. In fact, every high level language has many ways to produce an output.
When the implementation type is ‘HashSet’, there is no guarantee that the order will be maintained. However, if we want the list in alphabetical order, we can always use the ‘TreeSet’ implementation along with the ‘HashSet’. Changing the implementation makes our Set abstraction more robust.
When a Set is not equal to another Set, in mathematics we use a special symbol. This inequality symbol is a representation of XOR, which is actually inequality on Boolean.
Here, in this part of Set implementation, the Propositional Logic is also implemented. It is a part of Set mathematical abstraction, just like Symmetric Difference.
Question 1: We cannot find Symmetric Difference of two Sets without implementing Propositional Logic.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 2: If two different Sets contain duplicate numbers, the symmetric difference will easily point out those numbers.
Option 1: It is true only for Discrete Mathematics.
Option 2: It is true only for Discrete Mathematics, and as well as all programming languages.
Option 3: It is neither true for Discrete Mathematics, nor for any programming language.
Option 4: None of the above statement is true.
Answer: Option 2.
=======================
Question 3: We can take any Collection containing duplicate elements and convert it to another Collection removing all the duplicate elements.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 4: In Java Collection framework, a Set is a Collection that cannot contain duplicate elements.
Option 1: The above statement does not make any sense because Set is completely different concept.
Option 2: The above statement is partly true.
Option 3: It models the mathematical Set abstraction, Symmetric difference and Propositional logic one behind the other.
Option 4: None of the above statement is true.
Answer: Option 3
=======================
Challenge 1 : Why the Set implementation ‘HashSet’ is better than the general-purpose Set implementation ‘TreeSet’? Can you compare and prove that?
Solution to Challenge 1 :
Language Used: Java
’’’’’’’’’'’EXPLANATION ‘’’’’’’’’’’’
Although the ‘TreeSet’ maintains the order of elements based on their values. But implementation of ‘TreeSet’ comes at a higher cost. It is slower than the other Set implementation ‘HashSet’, which stores data in a hash table and maintains the time complexity.
’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’’
Challenge 2 : Can you combine Symmetric Difference and Propositional Logic in one program? You can write it in any programming language.
The arguments we have passed possess unique words, as well as duplicate words. We have tried to identify both the unique words, and the duplicate words. The symmetric difference of two Sets mean a new Set, where duplication is nonexistent.
The implementation of Set abstraction does not allow duplication.
Challenge 3 : In discrete mathematical paradigms, Set abstraction is basically chaotic, and unordered. However, no duplicate element is allowed in the world of Set. Can you prove it?
Solution to Challenge 3 :
Language Used: Java
10. Combinatorics and Counting, Permutation and Combinations
The area of mathematics in which counting is the primarily concern, it is known as Combinatorics. Besides other subjects like statistical physics, it has many applications in Computer Science.
The counting capabilities and other abstractions of Combinatorics is implemented in different ways (algorithm) to obtain results. In like manner, it also studies certain properties of finite structures, like data structures in Computer science.
Primarily, any data structure is related to the abstractions of any finite sets. We can apply different types of permutations on finite sets and that study also plays an important role in the fields of Combinatorics.
We can assume that when different types of permutations are required for data structures, different types of algorithms can also be emerged out from there.
For a single problem where enumeration or counting is required, there can be different types of algorithms. We can approach the problems whatever way. Regardless of how we solve the problems, different algorithms emerge from it.
Furthermore, to know how Combinatorics is associated with data structures and algorithm, we need to know the basic operations that are acted upon on the data structures with the help of different types of algorithms.
Be that as it may, but we cannot ignore these facts.
We have already found that enumeration of a specified data structure is a part of Combinatorics. Arrangements of a certain data structure is also a part of it. We can restructure or rearrange any ‘string’ value and place the characters in different combinations.
We can check whether a data structure fulfills a certain criteria. If there are several possibilities, we can also check, with the help of algorithms, what is the best structure or solution.
We have just learned that study of Combinatorics deals with different types of permutations on finite sets or data structures.
Therefore, to understand the key abstractions of Combinatorics we need to understand what permutations and combinations are.
Permutation and Combination
When we use the word ‘combination’, we forget whether ‘order’ is important or not. We also leave behind another key concept, ‘repetition’. When repetition is allowed, the number of combinations increase. When it is not allowed, the number of combination or, in other words, the number of rearrangement reduces.
Consider an example that will make it clear.
Take a word ‘forget’. If we go to any thesaurus and try to find what kind of word is it, we find several other words that are close to its meaning.
They are: overlook, drop, neglect, omit, leave out, etc. We can rearrange these group of words in many ways where order is not maintained. When we do not care what order the words are in, we say it a ‘combination’ of words.
However, when we apply the alphabetical order and rearrange the group of words again, then it comes out like this: drop, neglect, omit, overlook, leave out, etc. When order matters in a combination, it is called permutation.
Think about a combination lock. We can choose any three numbers from 0 to 10. If in our case, the combination works in this arrangement ‘562’, we can say that repetition is not allowed in this permutation. If it was like ‘223’, we could have said that repetition is allowed in this permutation.
Take a close look, we will find that any permutation where repetition is allowed, the number of possibilities is nothing but factorial of that number of things that are to be rearranged.
Consider any three things. How many ways we can rearrange them? Finding it is very easy.
There are 6 ways we can rearrange three elements. If there were 4 elements, the factorial of 4, that is, in 24 ways we could have rearranged those 4 things.
Enough theory; let us dive into our first code where we will find the best algorithm to solve the above problem.
The output is quite expected. When 5 things are to be rearranged, there are 120 possibilities, which is actually the factorial of 5.
What we have seen, is a tip of iceberg. In reality, there are hundreds of Combinatorial algorithms that deal with different types of data structures based on finite sets.
We can include even structures that are built from graphs.
We will talk about them in a few minutes, but before that, let us see some common examples of Combinatorial algorithms. We can think of generating list of all structures of a given type. In this scenario, we can think of permutation with repetition, or without repetition. We can think of combinations with repetition, or without repetition.
Search algorithms are good examples where Optimization and Approximation algorithms are used to solve such problems. Optimization methods or algorithms also include dynamic programming. On the other hand, Approximation methods include greedy algorithms.
We cannot say that one algorithm is the best solution. There could be a better algorithm than the previously claimed ‘best’.
Let us solve a problem and see whether the solution is best or not.
There are five things. We have found out that if there are five things, there could be 120 possibilities. Let us pick any two of them and try to rearrange them in order. However, we cannot repeat the same thing twice. It is a permutation without repetition, still it is little bit different.
If we want to rearrange any two things from that five elements without repetition, what would be the order.
Here is the explanation. When we pick up 2 elements from 5 elements, 3 elements are left behind. The algorithm is quite simple.
We have done the same thing. And here is the output:
As we have said earlier, there are hundreds of Combinatorial algorithms. Besides more common ‘sorting’ and ‘searching’, we have different types of generating ‘permutations and combinations’, graphs, and many others.
Our next problem will deal with some kind of generating permutation and combination. How many ways we can rearrange a string? If repetition is allowed, it is obvious that number of rearrangement will be more.
How it happens?
A string is a sequence of characters. It is a representation of data structure, an array of characters.
Therefore, it has a length. If repetition is allowed, we can count the length of the string, and safely say that the factorial of that number is the possibility.
But the real trick is in the process of rearrangement.
In the above code snippet, the string is of length 4.
Therefore, factorial of 4, that is 24 is the number of possibility. The above program rearranges the string 24 ways. However, the condition was repetition could be allowed.
In a given string where the length is 3, we can rearrange that string in 6 ways, which is the factorial of 3. What happens, when repetition is not allowed.
To test the above code, we have used a string where one character ‘b’ has been used twice. As a matter of fact, if repeating is allowed, there could be 6 rearrangements.
Since, repeating is not allowed, we get the above output.
Just before, in the code 10.3, we have seen an algorithm where repetition has been allowed, and accordingly we have seen the rearrangements.
We can do the same in a different way. We can rearrange the sequence of characters in such a way so that the string output would be of various orders. Some of them may look similar because we have allowed repetition.
Since the length of the string is 3, the factorial of 3 would give us the desired number. Factorial of 3 is 6, therefore, we have 6 arrangements in place.
There are hundreds of other Combinatorial problems and algorithms. To get our head around them to understand the inner logic, we need practice. Furthermore, we need to keep the issue of time complexity in our mind, at the same time.
Question 1: The key abstractions of Combinatorics depend on permutations and combinations.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 2: When order does not matter in a combination, it is called permutation.
Option 1: The above statement is true and can be proved in any programming language.
Option 2: The above statement is true mathematically, but we cannot implement it in any programming language.
Option 3: The above statement is neither true mathematically, nor we can implement it in any programming language.
Option 4: None of the above statement is true.
Answer: Option 4.
=======================
Question 3: A string is a sequence of characters. It is a representation of data structure, an array of characters.
Option 1: True
Option 2: False
Answer: Option 1
========================
Question 4: Search algorithms are good examples where Optimization and Approximation algorithms are used to solve such problems.
Option 1: The above statement is partly true.
Option 2: The above statement is does not make any sense.
Option 3: The above statement is absolutely true.
Option 4: None of the above statement is true.
Answer: Option 3.
=======================
Question 5: There are hundreds of Combinatorial algorithms.
Option 1: True
Option 2: False
Answer: Option 1
=======================
Challenge 1 : How many ways we can rearrange 5 balls.
Solution to Challenge 1:
Language Used: Java
Challenge 2 : Suppose you have a string “abcd”. You are aked to rearrange the String. How many ways could you rearrange the String when repetition is allowed? Here repetition means, you can rearrange this way: bacd, cdba, dbac, etc.
Solution to Challenge 3:
Language Used: Java
Challenge 3 : Can you show the difference between two types of permutation,where repetition is not allowed and allowed.
Solution to Challenge 3:
Language Used: Java
What Next
If you have read the whole book and reach to this point, it is worthy of high praise.
I have tried my best to weave three important components of Computer science and discrete mathematics, data structure and algorithm. Although they appear to be difficult at first glance, if you understand the key concepts, and understand how these three components are internally related, it no longer seems to be so difficult anymore.
The topic winds through the various programming languages, yet for the intermediate learners that should not be a big issue. Although programming languages have different syntax, the basic structure of every high-level language is same. Especially it is true for discrete mathematics, data structure and algorithm.