DBMS

Database Management System

Training Summary

A Database Management System (DBMS) is a collection of programs that enables its users to access a database, manipulate data, and report on or represent that data. This is a complete course on DBMS for beginners. These online notes cover basic to advanced topics such as DBMS architecture, data models, the ER model diagram, relational calculus and algebra, concurrency control, keys, data independence, etc.

What is a Database?

A database is a collection of related data which represents some aspect of the real world. A database system is designed to be built and populated with data for a certain task.

What is DBMS?

A Database Management System (also known as DBMS) is software for storing and retrieving users’ data while applying appropriate security measures. It allows users to create their own databases as per their requirements.

It consists of a group of programs which manipulate the database and provide an interface between the database and its users and other application programs.

The DBMS accepts the request for data from an application and instructs the operating system to provide the specific data. In large systems, a DBMS helps users and other third-party software to store and retrieve data.

History of DBMS

Here are the important landmarks from the history of DBMS:

  • 1960 – Charles Bachman designed the first DBMS
  • 1970 – E. F. Codd introduced the relational model (IBM’s Information Management System, IMS, a hierarchical DBMS, dates from the late 1960s)
  • 1976 – Peter Chen coined and defined the Entity-relationship model, also known as the ER model
  • 1980 – Relational model becomes a widely accepted database component
  • 1985 – Object-oriented DBMS develops
  • 1990s – Incorporation of object-orientation in relational DBMS
  • 1992 – Microsoft ships MS Access, a personal DBMS that goes on to displace most other personal DBMS products
  • 1995 – First Internet database applications
  • 1997 – XML applied to database processing. Many vendors begin to integrate XML into DBMS products.

Characteristics of Database Management System

  • Provides security and reduces redundancy
  • Self-describing nature of a database system
  • Insulation between programs and data, and data abstraction
  • Support of multiple views of the data
  • Sharing of data and multiuser transaction processing
  • DBMS allows entities and the relations among them to be represented as tables.
  • It follows the ACID concept (Atomicity, Consistency, Isolation, and Durability).
  • DBMS supports a multi-user environment that allows users to access and manipulate data in parallel.

Popular DBMS Software

Here is a list of some popular DBMS software:

  • MySQL
  • Microsoft Access
  • Oracle
  • PostgreSQL
  • dBASE
  • FoxPro
  • SQLite
  • IBM DB2
  • LibreOffice Base
  • MariaDB
  • Microsoft SQL Server

Advantages of DBMS

  • DBMS offers a variety of techniques to store & retrieve data
  • DBMS serves as an efficient handler to balance the needs of multiple applications using the same data
  • Uniform administration procedures for data
  • Application programmers are never exposed to details of data representation and storage.
  • A DBMS uses various powerful functions to store and retrieve data efficiently.
  • Offers data integrity and security
  • The DBMS enforces integrity constraints and access controls to provide a high level of protection against prohibited access to data.
  • A DBMS schedules concurrent access to the data so that users do not interfere with one another; for example, conflicting changes to the same data are applied one at a time.
  • Reduced Application Development Time

Disadvantages of DBMS

A DBMS may offer plenty of advantages, but it has certain flaws:

  • The cost of hardware and software for a DBMS is quite high, which increases the budget of your organization.
  • Most database management systems are complex, so users need training to use them.
  • In some organizations, all data is integrated into a single database, which can be damaged by an electrical failure or by corruption of the storage media.
  • Simultaneous use of the same program by many users can sometimes lead to the loss of some data.
  • A DBMS is not designed to perform sophisticated calculations.

What is a Database Transaction?

A transaction is a logical unit of processing in a DBMS which entails one or more database access operations. In a nutshell, database transactions represent real-world events of any enterprise.

All database access operations held between the begin and end transaction statements are considered a single logical transaction. During the transaction the database may be temporarily inconsistent. Only once the transaction is committed does the database move from one consistent state to another.
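
The all-or-nothing behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the `account` table, the `transfer` helper, and the sample balances are assumptions made for the example, and SQLite is used simply because it ships with Python. The credit is applied before the debit on purpose, so that a failing debit shows the earlier credit being rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one logical transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back too, so no partial transfer is ever stored.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "bob", "alice", 500)  # debit fails, credit is undone
```

After the failed transfer the balances are the same as after the successful one, which is the move from one consistent state to another that a committed transaction guarantees.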

Difference Between Primary key & Foreign key

Primary Key | Foreign Key
--- | ---
Helps you to uniquely identify a record in the table. | A field in the table that is the primary key of another table.
A primary key never accepts null values. | A foreign key may accept multiple null values.
In some systems (SQL Server by default, for example) the primary key is backed by a clustered index, and the data in the table is physically organized in the sequence of that index. | In many systems a foreign key does not automatically create an index, clustered or non-clustered. However, you can manually create an index on the foreign key.
You can have only a single primary key in a table. | You can have multiple foreign keys in a table.
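
A minimal sketch of these rules, using Python's built-in sqlite3 module. The `department` and `employee` tables and the sample rows are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"                        # one primary key per table
    " name TEXT NOT NULL,"
    " dept_id INTEGER REFERENCES department(dept_id))"    # foreign key, may be NULL
)

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'Ann', 1)")
conn.execute("INSERT INTO employee VALUES (11, 'Raj', NULL)")  # NULL foreign key is allowed
conn.execute("INSERT INTO employee VALUES (12, 'Lee', NULL)")  # ...more than once

def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_pk = try_insert("INSERT INTO employee VALUES (10, 'Dup', 1)")     # primary key already used
orphan_fk = try_insert("INSERT INTO employee VALUES (13, 'Orphan', 99)")    # no department 99
unassigned = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id IS NULL"
).fetchone()[0]
```

Both `duplicate_pk` and `orphan_fk` come back as rejected, while the two rows with a NULL `dept_id` are stored without complaint.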

Summary :

  • A DBMS key is an attribute or set of attributes which helps you to identify a row (tuple) in a relation (table)
  • DBMS keys allow you to establish and identify relationships between tables
  • Eight types of DBMS keys are Super, Primary, Candidate, Alternate, Foreign, Compound, Composite, and Surrogate Key.
  • A super key is a set of one or more attributes which uniquely identifies rows in a table.
  • A column or group of columns in a table which helps us to uniquely identify every row in that table is called a primary key
  • Candidate keys which are not chosen as the primary key are called alternate keys
  • A super key with no redundant attributes is called a candidate key
  • A compound key is a key which has many fields which allow you to uniquely recognize a specific record
  • A key which has multiple attributes to uniquely identify rows in a table is called a composite key
  • An artificial key which aims to uniquely identify each record is called a surrogate key
  • A primary key never accepts null values, while a foreign key may accept multiple null values.
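
The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is plain Python; the function names and the `students` rows are made up for illustration. Uniqueness in a sample only shows that the attributes could be a key, since a real key is a rule about every possible row of the table.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """True if `attrs` is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical sample relation
students = [
    {"student_id": 1, "email": "a@example.com", "name": "Ann", "course": "DB"},
    {"student_id": 2, "email": "b@example.com", "name": "Bob", "course": "DB"},
    {"student_id": 3, "email": "c@example.com", "name": "Ann", "course": "OS"},
]
```

Here `["student_id", "name"]` is a super key but not a candidate key, because `student_id` alone already identifies each row. Both `["student_id"]` and `["email"]` are candidate keys; whichever is chosen as the primary key, the other becomes an alternate key.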

What is Data Independence of DBMS?

Data independence is defined as a property of a DBMS that lets you change the database schema at one level of a database system without having to change the schema at the next higher level. Data independence helps you keep data separated from all the programs that make use of it.

You can use this stored data for computing and presentation. In many systems, data independence is an essential function for components of the system.

Types of Data Independence

In DBMS there are two types of data independence:

  1. Physical data independence
  2. Logical data independence.

Levels of Database

Before we learn Data Independence, a refresher on Database Levels is important. The database has 3 levels:

  1. Physical/Internal
  2. Conceptual
  3. External

Physical Data Independence

Physical data independence helps you to separate the conceptual level from the internal/physical level. It allows you to provide a logical description of the database without the need to specify physical structures. Compared to logical independence, physical data independence is easy to achieve.

With physical independence, you can change the physical storage structures or devices without affecting the conceptual schema. Any change made is absorbed by the mapping between the conceptual and internal levels. Physical data independence is achieved through the presence of the internal level of the database and the mapping from the conceptual level to the internal level.

Examples of changes under Physical Data Independence

Due to physical independence, none of the changes below will affect the conceptual layer.

  • Using a new storage device such as a hard drive or magnetic tape
  • Modifying the file organization technique in the database
  • Switching to different data structures
  • Changing the access method
  • Modifying indexes
  • Changing compression techniques or hashing algorithms
  • Changing the location of the database, say from the C drive to the D drive
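
As one concrete case from the list above, adding an index changes how the data is accessed but not what a query returns. The sketch below uses Python's built-in sqlite3 module; the `customer` table, the index name, and the sample rows are assumptions for the example, and the exact wording of the query plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO customer (city, name) VALUES (?, ?)",
    [("Oslo", "Ann"), ("Pune", "Raj"), ("Oslo", "Lee")],
)

QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"

def run_query():
    """What the application sees: the names of customers in one city."""
    return [row[0] for row in conn.execute(QUERY, ("Oslo",))]

def query_plan():
    """How SQLite intends to execute the query (last column is the description)."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("Oslo",))
    return " ".join(row[-1] for row in plan)

before_rows, before_plan = run_query(), query_plan()

# Physical-level change only: the table definition and the query are untouched.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

after_rows, after_plan = run_query(), query_plan()
```

The rows are identical before and after, while the plan switches from scanning the table to using `idx_customer_city`. The application code did not have to change.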

Logical Data Independence

Logical Data Independence is the ability to change the conceptual schema without changing:

  1. External views
  2. External API or programs

Any change made will be absorbed by the mapping between external and conceptual levels.

When compared to Physical Data independence, it is challenging to achieve logical data independence.

Examples of changes under Logical Data Independence

Due to logical independence, none of the changes below will affect the external layer.

  1. Adding, modifying, or deleting an attribute, entity, or relationship without rewriting existing application programs
  2. Merging two records into one
  3. Breaking an existing record into two or more records
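
One common way to get this kind of insulation is to let applications read through a view. The sketch below uses Python's built-in sqlite3 module; the `employee` table and the `employee_directory` view are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 5000)")

# External level: the application only ever queries this view.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee")

def directory():
    """The application's query, written against the view, not the table."""
    return conn.execute(
        "SELECT emp_id, name FROM employee_directory ORDER BY emp_id"
    ).fetchall()

before = directory()

# Conceptual-level change: a new attribute is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN email TEXT")

after = directory()
```

The view absorbs the change, so `directory()` returns the same rows before and after the new column is added. Changes such as dropping or renaming a column that the view uses would still require the view definition to be updated, which is part of why logical independence is harder to achieve.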

Difference between Physical and Logical Data Independence

Logical Data Independence | Physical Data Independence
--- | ---
Mainly concerned with the structure of the data or with changes to the data definition. | Mainly concerned with the storage of the data.
It is difficult, as the retrieval of data depends mainly on the logical structure of the data. | It is easy to retrieve.
Compared to physical independence, it is difficult to achieve logical data independence. | Compared to logical independence, it is easy to achieve physical data independence.
You need to make changes in the application program if fields are added to or deleted from the database. | A change at the physical level usually does not need a change at the application program level.
Modification at the logical level is significant whenever the logical structures of the database are changed. | Modifications at the internal level may or may not be needed to improve the performance of the structure.
Concerned with the conceptual schema. | Concerned with the internal schema.
Example: adding, modifying, or deleting an attribute. | Example: a change in compression techniques, hashing algorithms, storage devices, etc.

Importance of Data Independence

  • Helps you to improve the quality of the data
  • Database system maintenance becomes affordable
  • Enforcement of standards and improvement in database security
  • You don’t need to alter data structures in application programs
  • Permits developers to focus on the general structure of the database rather than worrying about the internal implementation
  • It helps preserve the integrity of the database, that is, a state which is undamaged and undivided
  • Database inconsistency is vastly reduced.
  • Modifications at the physical level, when needed to improve system performance, can be made easily.

Summary

  • Data independence is the property of a DBMS that lets you change the database schema at one level of a database system without having to change the schema at the next higher level.
  • The two types of data independence are 1) Physical and 2) Logical
  • Physical data independence helps you to separate the conceptual level from the internal/physical level
  • Logical data independence is the ability to change the conceptual schema without changing external views or application programs
  • Compared to physical data independence, logical data independence is challenging to achieve
  • Data independence helps you to improve the quality of the data.

Complete Difference between DBMS and RDBMS

What is DBMS?

A DBMS is software used to store and manage data. DBMSs were introduced during the 1960s to store any data. A DBMS also supports manipulation of the data, such as insertion, deletion, and updating.

A DBMS also performs functions like defining, creating, revising, and controlling the database. It is specially designed to create and maintain data and to enable individual business applications to extract the desired data.

What is RDBMS?

A Relational Database Management System (RDBMS) is an advanced version of a DBMS. It came into existence during the 1970s. An RDBMS also allows the organization to access data more efficiently than a DBMS.

An RDBMS is a software system used to store only data which needs to be stored in the form of tables. In this kind of system, data is managed and stored in rows and columns, which are known as tuples and attributes respectively. RDBMS is a powerful data management system and is widely used across the world.

Difference between DBMS vs RDBMS

Parameter | DBMS | RDBMS
--- | --- | ---
Storage | DBMS stores data as a file. | Data is stored in the form of tables.
Database structure | DBMS stores data in either a navigational or hierarchical form. | RDBMS uses a tabular structure where the headers are the column names and the rows contain the corresponding values.
Number of users | DBMS supports a single user only. | It supports multiple users.
ACID | In a regular database, the data may not be stored following the ACID model. This can lead to inconsistencies in the database. | Relational databases are harder to construct, but they are consistent and well structured. They obey ACID (Atomicity, Consistency, Isolation, Durability).
Type of program | It is a program for managing databases on computer networks and system hard disks. | It is a database system used for maintaining the relationships among tables.
Hardware and software needs | Low software and hardware needs. | Higher hardware and software needs.
Integrity constraints | DBMS does not support integrity constraints. Integrity constraints are not imposed at the file level. | RDBMS supports integrity constraints at the schema level. Values beyond a defined range cannot be stored in the particular RDBMS column.
Normalization | DBMS does not support normalization. | RDBMS can be normalized.
Distributed databases | DBMS does not support distributed databases. | RDBMS offers support for distributed databases.
Ideally suited for | DBMS mainly deals with small quantities of data. | RDBMS is designed to handle large amounts of data.
Dr. E.F. Codd rules | DBMS satisfies fewer than seven of Dr. E.F. Codd's rules. | RDBMS satisfies 8 to 10 of Dr. E.F. Codd's rules.
Client-server | DBMS does not support client-server architecture. | RDBMS supports client-server architecture.
Data fetching | Data fetching is slower for complex and large amounts of data. | Data fetching is rapid because of its relational approach.
Data redundancy | Data redundancy is common in this model. | Keys and indexes help to prevent data redundancy.
Data relationship | No relationship between data. | Data is stored in the form of tables which are related to each other with the help of foreign keys.
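
The integrity-constraints row can be illustrated with a schema-level range check. This is a sketch using Python's built-in sqlite3 module; the `student` table, the `insert_student` helper, and the age range are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"                           # a value is required
    " age INTEGER CHECK (age BETWEEN 5 AND 120))"    # values outside the range are refused
)

def insert_student(roll_no, name, age):
    """Try to store a row; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", (roll_no, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = insert_student(1, "Ann", 20)
out_of_range = insert_student(2, "Bob", 200)   # violates the CHECK constraint
missing_name = insert_student(3, None, 20)     # violates NOT NULL
stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Only the first row is stored. The rules live in the schema, so every application that writes to the table is held to them.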

Top 35 Database (DBMS) Interview Questions & Answers

1) Define Database.

An organized collection of data is called a database.

2) What is DBMS?

Database Management Systems (DBMS) are applications designed specifically to enable users and other applications to interact with a database.

3) What are the various kinds of interactions catered by DBMS?

The various kinds of interactions catered by DBMS are:

  • Data definition
  • Update
  • Retrieval
  • Administration

4) What are the features of Database language?

A database language may also incorporate features like:

  • DBMS-specific configuration and management of the storage engine
  • Computations to modify query results, like summing, counting, averaging, grouping, sorting and cross-referencing
  • Constraint enforcement
  • Application Programming Interface

5) What do database languages do?

As special-purpose languages, they have:

  • Data definition language
  • Data manipulation language
  • Query language

6) Define database model.

A database model is a data model that determines the logical structure of a database and, fundamentally, how data can be stored, organised, and manipulated.

7) What is SQL?

Structured Query Language (SQL) is an ANSI-standard language for accessing and updating databases.

8) Define Normalization.

Normalization is the process of organizing data within a database so that it is free of redundancy and inconsistent dependencies.

9) Enlist the advantages of normalizing database.

The advantages of normalizing a database are:

  • No duplicate entries
  • Saves storage space
  • Boosts query performance

10) Define Denormalization.

Denormalization is the deliberate addition of redundant data to a database to boost its performance, typically by avoiding complex joins.

11) Define DDL and DML.

Data Definition Language (DDL) consists of the statements that define and manage the properties and attributes of a database, such as its tables and columns.

Data Manipulation Language (DML) consists of the statements that manipulate the data in a database, such as inserting, updating, and deleting.
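
The split between the two can be seen in a short sketch using Python's built-in sqlite3 module. The `product` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the structure of the database
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("ALTER TABLE product ADD COLUMN stock INTEGER DEFAULT 0")

# DML: statements that change the data held in that structure
conn.execute("INSERT INTO product (name, price) VALUES ('pen', 1.5)")
conn.execute("INSERT INTO product (name, price) VALUES ('ink', 4.0)")
conn.execute("UPDATE product SET stock = 10 WHERE name = 'pen'")
conn.execute("DELETE FROM product WHERE name = 'ink'")

rows = conn.execute("SELECT id, name, price, stock FROM product").fetchall()
# PRAGMA table_info is SQLite-specific; column 1 of each result row is the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(product)")]
```

The DDL statements determine which columns exist (`columns`), while the DML statements determine which rows are present (`rows`).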


12) Define cursor.

A cursor is a database object that represents a result set and helps in manipulating its data row by row.
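
Row-by-row processing can be sketched with the cursor object of Python's built-in sqlite3 module. A client-side cursor like this is only an analogue of the cursors a DBMS offers inside stored procedures, and the `task` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO task (title) VALUES (?)", [("write",), ("review",), ("ship",)]
)

cursor = conn.cursor()
cursor.execute("SELECT id, title FROM task ORDER BY id")  # the cursor now represents the result set

processed = []
while True:
    row = cursor.fetchone()  # advance to the next row of the result set
    if row is None:          # no rows left
        break
    processed.append(f"{row[0]}:{row[1].upper()}")
cursor.close()
```

Each call to `fetchone()` hands over a single row, so the program can act on one row before moving to the next.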

13) Enlist the cursor types.

They are:

  • Dynamic: reflects changes made to the underlying data while scrolling.
  • Static: does not reflect changes while scrolling; it works on a snapshot of the data.
  • Keyset: reflects modifications to existing rows, but does not show newly added rows.

In PL/SQL, cursors are also classified as:

  • Implicit cursor: declared automatically when a SQL statement is executed, without the user being aware of it.
  • Explicit cursor: declared by the programmer in PL/SQL to handle a query that returns more than one row.

14) Define sub-query.

A query nested within another query is called a sub-query.

15) Why is group-clause used?

The GROUP BY clause collects rows with similar data into groups so that aggregate values can be derived for each group.

16) Compare Non-clustered and clustered index

Both have a B-tree structure. A non-clustered index holds pointers to the data rows, so one table can have many non-clustered indexes, whereas a table can have only one clustered index because it determines the physical order of the rows.

17) Define Aggregate functions.

Functions which operate on a collection of values and return a single value are called aggregate functions.
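
A small sketch of aggregate functions, with and without grouping, using Python's built-in sqlite3 module. The `sale` table and its figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 5)],
)

# Without GROUP BY, the aggregate collapses the whole table into one value.
total = conn.execute("SELECT SUM(amount) FROM sale").fetchone()[0]

# With GROUP BY, each aggregate returns one value per group of similar rows.
per_region = conn.execute(
    "SELECT region, COUNT(*), SUM(amount), AVG(amount) "
    "FROM sale GROUP BY region ORDER BY region"
).fetchall()
```

`total` is a single number for all rows, while `per_region` holds one row per region with its count, sum, and average.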

18) Define Scalar functions.

A scalar function operates on the argument it is given and returns a single value.

19) Define “correlated subqueries”.

A ‘correlated subquery’ is a subquery that depends on the outer query for a value it uses. Because of this dependency it cannot be evaluated on its own: conceptually, it is executed once for each row processed by the outer query.
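
A minimal sketch of a correlated subquery, using Python's built-in sqlite3 module. The `employee` table and its rows are hypothetical. The inner query refers to `e.dept` from the outer query, which is what makes it correlated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "eng", 120), ("Bob", "eng", 80), ("Cy", "ops", 60), ("Di", "ops", 70)],
)

# Employees who earn more than the average salary of their own department.
above_dept_average = conn.execute(
    """
    SELECT e.name
    FROM employee AS e
    WHERE e.salary > (
        SELECT AVG(i.salary)
        FROM employee AS i
        WHERE i.dept = e.dept   -- value supplied by the current outer row
    )
    ORDER BY e.name
    """
).fetchall()
```

The inner average is different for each department, so it has to be worked out in the context of each outer row rather than once up front.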

20) Define Data Warehousing.

Data warehousing is the storage of data in, and access to data from, a central location in order to support strategic decisions. It provides the framework for managing enterprise information.

21) Define Join and enlist its types.

Joins express the relation between different tables. They enable you to select data from one table in relation to data in another table.

The various types are:

  • INNER JOINs: return only the rows that have matching values in all of the joined tables.
  • OUTER JOINs: divided into Left Outer Join and Right Outer Join. All rows from the specified side are kept, and the columns from the other side are left blank (NULL) where there is no match.

Other joins are CROSS JOINs, NATURAL JOINs, EQUI JOINs and NON-EQUI JOINs.
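
The difference between an inner and an outer join can be sketched with Python's built-in sqlite3 module. The `author` and `book` tables and their contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE book (title TEXT, author_id INTEGER)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])
conn.execute("INSERT INTO book VALUES ('SQL Basics', 1)")  # Raj has no book

# INNER JOIN: only authors that have a matching book
inner = conn.execute(
    "SELECT a.name, b.title FROM author a "
    "INNER JOIN book b ON b.author_id = a.id ORDER BY a.id"
).fetchall()

# LEFT OUTER JOIN: every author, with NULL where no book matches
left = conn.execute(
    "SELECT a.name, b.title FROM author a "
    "LEFT OUTER JOIN book b ON b.author_id = a.id ORDER BY a.id"
).fetchall()
```

The inner join drops the author without a book, while the left outer join keeps that author and returns `None` (SQL NULL) for the missing title.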

22) What do you mean by Index hunting?

Indexes help improve the speed and query performance of a database. The procedure of improving the collection of indexes is called index hunting.

23) Define B-trees.

A B-tree is a tree data structure that keeps data sorted and allows searches, insertions, deletions, and sequential access in logarithmic time.

24) Differentiate Table Scan from Index Scan.

Iterating over all the table rows is called a table scan, while iterating over all the index items is called an index scan.

25) What do you mean by Fill Factor concept with respect to indexes?

Fill factor is the value that defines the percentage of space on every leaf-level page that is to be filled with data; the remaining space is left free. 100 is the default value of fill factor.

26) Define Fragmentation.

Fragmentation can be defined as a database server feature that lets the user control where data is stored at the table level.

27) What is Database partitioning?

Database partitioning is the division of a logical database into independent, complete units to improve its manageability, availability, and performance.

28) Explain the importance of partitioning.

Database partitioning logically splits one large table into smaller database entities. Its benefits are:

  • Query performance can improve dramatically when most of the heavily accessed rows are in one partition.
  • Large parts of a single partition can be accessed efficiently.
  • Slower and cheaper storage media can be used for data which is seldom used.

29) Define Database system.

A DBMS together with its database is called a database system.

30) What do you mean by Query Evaluation Engine?

Query Evaluation Engine executes the low-level instructions that are generated by the compiler.

31) Define DDL Interpreter.

The DDL interpreter interprets DDL statements and records them in tables containing metadata.

32) Define Atomicity and Aggregation.

Atomicity: It’s an all-or-nothing concept which assures the user that incomplete transactions are taken care of. The actions of an incomplete transaction are undone in the DBMS.

Aggregation: In this model, a collection of entities and their relationship is aggregated and treated as a single entity. It is mainly used to express relationships among relationships.

33) Enlist the various transaction phases.

The various transaction phases are:

  • Analysis Phase.
  • Redo Phase
  • Undo Phase

34) Define Object-oriented model.

This model is made up of collections of objects, in which values are stored within instance variables inside the object. The object also contains bodies of code that operate on it, which are called methods. Objects containing the same kinds of variables and methods are grouped into classes.

35) Define Entity.

It can be defined as being a ‘thing’ with an independent existence in the real world.