Entries with tag technology tips .

SQL - Basic Queries

In the last article we started learning about some basic ways to write SQL queries.  We'll continue with that now and also start introducing more complex concepts.  We have already written simple queries with the where clause and the following are some other ways in which a condition can be specified:

select ID 
from teacher
where Paycheck >= 2000 and Paycheck <= 5000;
We can see that multiple conditions can be specified in a where clause using logical operators like and, or etc.  A simpler way of writing the previous query would be using the between operator:
select ID
from teacher
where Paycheck between 2000 and 5000;
Another useful feature which can be used when writing SQL queries is the existence of character array comparation operators, which are based on the standard lexicographic character order, just like in standard programming languages (for example, 'abcd' is less than 'b').  For comparing character arrays, the like operator is often used, and it enables comparing a part of the char array with certain parts of a pattern.  For specifying the pattern, symbols '%' and '_' are used.  The '%' symbol corresponds to any arrays of (zero or more) characters, and the '_' symbol corresponds to a single, arbitrary character.  So, let's use the like operator to obtain all names starting with 'Ma':
select Name
from person
where Name like 'Ma%';
The following query selects all subjects which have '1' as a third character in their subject code:
select *
from subject
where SubjectID like '_ _1%';


Sorting Query Results

For displaying the results of a query in a certain order, we use the order by clause.  For example:

select Name, City
from person
order by Name [desc];
The desc modifier is optional, and you use it if you want the data to be sorted in descending order.  The default order is ascending.


Set Operations

SQL supports set operations - union, intersect and except, which are executed in a similar way to the corresponding relational algebra operations.  All of the three operations are binary.  The operands are relations which have to be compatible, i.e. they have to have the same number of attribute, and the the corresponding attributes have to have the same domain. Unlike these operations in relational algebra, SQL set operation results can contain duplicates.  For example, to obtain a list of IDs of teachers who teach at the Computer Science department and/or the Electronics department, we can specify the following query:

(select TeacherID
from teaches
where Department = 'Computer Science')


(select TeacherID
from teaches
where Department = 'Electronics');
The other set operations work in the same way.


SQL - Altering Relations

Last time we learned how to create a relational database schema and relations.  Now we will expand our knowledge about SQL.  Deleting objects of the SQL schema can be performed using the drop command.  For example, the following statement deletes the table test_table in its entirety (both the data - table entities, and the definition - relational schema):

drop table test_table;
Deleting the entire SQL schema is performed using the following statement:
drop schema UniSchema;
There are two options of the drop statement which can be specified: cascade and restrict.  The restrict option prevents the table from being deleted if the table is referenced, for example, by a foreign key of another table.  Unlike that, the cascade option deletes not only the specified table, but also the tables which reference the given table.  Options are specified at the end of the drop statement.


Existing definitions of SQL schema objects can be altered using the alter command.  For example, if we want to add the dateOfBirth attribute to the person table, we can do that by writing:

alter table person add dateOfBirth date;
As you can see, it is necessary to also state the type of the attribute we are adding.  An attribute can also be removed by using the alter and drop commands:
alter table person drop dateOfBirth [restrict/cascade];
Deletion of attributes can also be performed while specifying restrict or cascade option.


SQL - Basic Queries

As previously stated, a relational database can (in an informal and simplified manner) be viewed as a collection of tables.  One of the most important things a user (or administrator) expects from a database is a simple way of acquiring and manipulating table data, and representing it in a readable format.  This is achieved by writing and executing queries.
Queries are non-case sensitive (except in an array of characters), and multiple statements are separated with the ; sign.  The basic form of an SQL query is:

-- this is a comment

select A1, A2,...       -- A1, A2 etc. are attributes 
from r                  -- r is a relation (table)
where P;                -- P is a predicate
For example, if we want to display the names of all the classes in a department (from our UniDB example), we will execute the following query:
select name
from class;

-- we can also specify the distinct keyword to retrieve unique names:
select distinct name
from class;
If we want to retrieve only those attributes which satisfy a certain condition, we can also specify a predicate (condition):
select name
from class
where credits>5;
This was the basic and most simple form of an SQL query, and in the following articles we will discuss more complex queries.


SQL - Creating relations

When we talked about datatypes in SQL, we said that we can directly associate an attribute with a predefined data type.  Another way of specifying a domain is using the SQL create domain statement:

create domain NAME as char(15);
This creates a domain and assigns a name to it.  Now we can use that domain when defining a new attribute, instead of using a predefined SQL type.  A useful consequence of using this kind of approach is the fact that, if, for example, we need the NAME domain to be extended to char(20), we only need to make one change - at the declaration of the domain.


In SQL, permanent relations - tables - are created using the create table statement.  The statement must include the specification of the table name and attribute names and domains.  Optionally, the statement can also include constraint specifications, such as primary or foreign key constraints.  A typical form of the create table statement is:

create table table_name
   column_name1 datatype [constraint for column1]
   column_name2 datatype [constraint for column2]
   column_name3 datatype [constraint for column3]
   [integrity constraint 1],
   [integrity constraint 2]
where table_name is the name of the relation (table) being created, column_name1 the name of an attribute, and datatype a predefined datatype (with optional specification of precision, scale value or length) or the name of a declared domain.  Constraints for columns are optional and most often contain the not null specification, which means that null values are not allowed.


Other constraints regarding the value of columns (attributes) can contain specifications of uniqueness, referential integrity (referencing another table's [primary] key) or other limitations which can be specified using the check constraint.  Typical integrity constraints are:

  • Uniqueness specification with the following form:
    primary key (column_name1, column_name2, ...)
    This means that the values on the attributes which constitute the primary key must be unique (they are by default, and must always be, not null).

  • Referential integrity (foreign key specification):
    foreign key (list of referencing columns) references name_of_ref_table(list of referenced columns)
    where list of referencing columns is in the form of column_name1, column_name2, ..., and the listed columns are the attributes constituting the foreign key.  The list of referenced columns is a list of attributes from the name_of_ref_table table referenced by the foreign key.

  • Check constraint, in the form of:
    where P is a predicate/condition which has to be satisfied for each entity of the table.


Now we will see a couple of examples - statements for creating relations of the uni database we modeled earlier:

create table faculty
   FacName char(30) not null, 
   Address char(50),
   primary key(FacName)

create table phone
   FacName char(30) not null,
   FacPhone char(10) not null,
   primary key(FacName, FacPhone),
   foreign key(FacName) references faculty(FacName)

create table person
   PersonID long not null,
   Name char(50) not null,
   DateOfBirth date not null,
   Address varchar(50),
   primary key(PersonID)
In the following articles we will cover authorization and privileges, editing/deleting relations and attributes, and start learning about basic SQL queries.


SQL - Structured Query Language

Before we start talking about SQL, we first need to define some basic concepts related to relational databases.  One of these concepts is a relation - a relation is analogous to an entity type from the entity-relationship model.  Later, we will also see that in fact, a relation is one table in a database.  Similarly, a relational schema is a collection of relations, and represents the entire database.  The reason we are mentioning this is the fact that the entire concept of relational databases has a mathematical foundation.  All of the operations we will be performing on tables (relations) are based on a special field of mathematics called relational algebra.  We will not focus on the mathematics in this series, but it is important to know what a relation is and understand that every operation is strictly defined.  From this point on, we will be using the terms table and relation synonymously, for the sake of simplicity.

SQL (Structured Query Language) is a powerful, high-level declarative language created for working with relational databases.  The first commercial implementation of SQL was created by Oracle (then called Relational Software, Inc.) in 1979.  The language quickly became popular, and thanks to many implementations, constant development and numerous upgrades, the specification of SQL (standardized by ANSI) became too complex for practical purposes.  This is why today, the SQL standard demands that database management system (DBMS) manufacturers declare the level of conformance with the standard.  The minimal level of conformance required is called SQL core conformance, but unfortunately, due to various reasons, even the most significant DBMSs do not meet these requirements.  Still, standardization did enable a high level of portability between systems with different DBMSs.

Basic SQL Data Types

SQL has the following basic, predefined data types: numeric types, character strings, date/time types, and logical type.
Numeric types can be divided into two groups - exact and approximate.  Examples of exact numerical types in SQL are:

  • smallint - an integer data type whose domain depends on the implementation.  Most commonly the number of bits used for representing this type is 16.  SQL only specifies that the precision of smallint is less than the precision of int.
  • int - similarly to smallint, this data type is usually represented with 16 bits.
  • bigint - similarly to the above, this data type is usually represented with 19 bits.
  • numeric(p,s) - a numeric data type for representation of decimal numbers. p signifies the total number of digits, and s signifies the number of digits after the decimal point.  For example, if data is specified as numeric(4,2), it can contain the value of 15.99.
Approximate data types are:
  • real/double precision
  • float(n) - an approximate numeric type with a binary precision of at least n binary digits.
Character strings are fixed or variable size data types which contain char arrays.  Examples are:
  • char(n) - specifying a fixed size character array. If n is left out, the default size is 1.  If the string contains less than n elements, the remaining places are filled with white spaces.  Constant char arrays are specified by single quotes, for example, 'myTestString'.
  • varchar(n) - specifying a variable size character array - the length can range from 0 to n characters.
Date/Time data types are used to define data related to calendar dates, timestamps or time intervals.  Date/time data types are:
  • date - contains day, month and year in the following format: DD-MM-YYYY.  Constant values can be specified using the date keyword, like date'28-06-2015'.
  • time - similarly, time is formatted like HH-MM-SS, and the time keyword is also available.
  • timestamp - combines the date and time.  Constant calues can be specified using the timestamp keyword like this: timestamp'28-06-2015 13:22:05'.

The logical (boolean) type has the standard values of true and false.

You can see that these types are quite similar to those often found in "regular" programming languages.  In the next couple of articles we are going to continue learning about SQL, create domains, relations, restrictions and CRUD operations.  After we have done all that, we should be able to write complex SQL triggers and stored procedures, which can be a powerful tool for managing a relational database.

Entity-Relationship Diagram (Example)

In the database modeling series so far we have covered a lot of theory concerning the entity-relationship model and creating diagrams.  Now we are going to give an example of an entity-relationship diagram, which will represent a simplified model of a university center.  We will briefly explain what this model contains, and then show an image of our diagram.  Everything on the image (symbols, notation) is standardized and only the things we have covered in the previous articles have been used.

Note: You can create your diagrams using a specialized tool like ERwin, or using a multitude of online tools, for example draw.io.  Since we are using standard notation, the environment shouldn't play any role in creating or displaying a model.  An advantage of using specialized software is the opportunity to automatically generate an SQL schema from the model, but we will come back to that later.

As we said, this is a significantly simplified model of a university center.  We will use this example throughout the series because it is easy to understand and diverse enough to illustrate all the concepts we are learning.  However, if this kind of a database was to be modeled in a real-world system, it would contain many more entities, more complicated structures, more complex relationships, etc.  Taking into consideration all of the details when modeling a relational database is no simple task.  For now, we will be using a basic model: a university center can have multiple universities, which are identified by ther names.  Some other information we are storing about a university is its address and telephone number.  Each university can have multiple faculties, and each faculty can have multiple departments.  A department contains classes, and each class is taught by a teacher.  In the model, you can see that Teacher is a specialization of the Person entity, and so is a Student (Generalization/Specialization is a concept we have learned about in ... ).  A person is identified by an ID, and other stored data is the person's name, address and date of birth.  This automatically means that students and teachers are also identified by ther ID, and all of the mentioned attributes are also attributes of their entity types, since they are "inherited" from the Person entity type.  A student can be enrolled in multiple classes, and take exams.  An important thing to notice in this model is that Exam is a weak entity type.  It is uniquely identified not only by the date, but also with the primary key of the Class entity type.  This means that the primary key of the weak entity type Exam is a combination of the exam's date and the class's ID.  This is logical and intuitive, since exams in different classes can take place the same day.

This was a brief explanation of the diagram, which should be enough to understand all of the concepts used in this model.  We recommend trying to create your own simple model and its diagram using a tool of your choice, because in the next couple of articles we are going to move on, and introduce the SQL language.  Following is an image of our university center database model: