Showing posts with label computer engineering bachelor's degree. Show all posts
Showing posts with label computer engineering bachelor's degree. Show all posts

Regular Expression Engine in C# (the Story)

A “long time ago”, more precisely 3 years ago, I was studying Automata and Formal Languages which was a Computer Engineering discipline in the 6th semester out of 10 at UBM.

At that time I was amazed by the new things I was learning such as NFA - Nondetermistic Finite Automaton, DFA - Deterministic Finite Automaton, Finite State Machine and Regular Expressions. For more on this, read my last post: Fortran Numerical Constants Recognizer.

For the sake of my development I started searching for programming related material that could put me informed about such amazing computer science constructs.

I then came across a series of articles at GameDev.net called Algorithmic Forays. It’s been written by Eli Bendersky. This guy did a great job putting together the base for a Regular Expression Engine that exercise the concepts of NFA and DFA. His code was presented in C++ and I took the time to learn a bit more about this powerful language too.

After reading Algorithmic Forays from part 1 through part 6 I started thinking about translating the code from C++ to C#. That’d be a great opportunity to grasp new C# constructs and at the same time get in touch with material related to the discipline afore mentioned.

Setting that in my mind I put my hands at work and after struggling here and there I got the code translated. I also thought about writing a Portuguese version of the article to be publicized at my university’s scientific magazine. The idea was to write an adapted version of the article presenting both code (C++ and C#) showing how to achieve the same results using different programming languages data structures.

I’ve sent a message to Eli asking him permission to write a Portuguese version of the article. You can read the e-mail I’ve sent Eli and his reply below:

Gmail
Leniel Macaferi <le@…>


Algorithmic Forays for a Regex Parser

le@... <le@...>                                                              Wed, Jan 3, 2007 at 5:43 AM

To: eli…@…

Cc: Wellington Magalhães Leite <wml@…>

Dear Eli,

My name is Leniel and I'm a student of Computer Engineering here in
Brazil. I'm currently in the 9th term out of 10.
You can see my basic profile at
http://www.codeproject.com/script/profile/whos_who.asp?id=1224713

Today I'm on my vacation and I'm going to come back studying on February.

I'm passionate for anything related to technology and specially
programming which is my hobby just like yours.

Firstly I'd like to congratulate you by your advanced knowledge of
such great topics about Computer Science and Computer Engineering like
the one you discussed in the article Algorithmic Forays - Part 1 to 6.

Well, I've read your biography at http://eli.thegreenplace.net/about/
and I'm currently seeing your photo album. You're a great example for
us to follow due your characteristics as liking to study and liking to
travel the whole world. It's just what I want to do too as a
professional and as a traveler! You really do very well on this.

Your site is all of good! Think about the entire internet filled with
sites just like yours. With a complete bunch of interesting
information! My God it's a dream.

Coming to the real reason I'm writing to you - it's about a project
that has been proposed to me and a friend of mine called Wellington.
When we were in the 7th term at the university during our class of
Laboratory of Compiler Construction studying Lex and Yacc, our teacher
asked us to prepare an article about the process of creation of an
engine to parse Regular Expressions. Since that moment we started
searching the Internet for something that we could use as a base for
our article. For our lucky, we found your article Algorithmic Forays
at www.gamedev.net that is amazing for everyone to understand. It's
simple and direct with implementation examples.
During the 6th term at the university we had a discipline called
Automatons and Formal Languages. Was at this period that we learned
topics regarding Finite State Machines (FSM), Deterministic Finite
Automatons (DFA), Nondeterministic Finite Automatons (NFA) and Regular
Expressions. So, from there we know that a DFA + NFA = FSM! We've also
implemented some code for covering and getting a better understanding
of how such topics work in reality.

In true, we started our project preparing an article on April, May of
2006, when I particularly started converting all your Algorithmic
Foray's C++ code to the C# language that is the one I most like to use
for writing any kind of code. But due our other tasks, in other words,
other disciplines, we stopped working on this project. It's important
to mention that I could finalize translating your C++ code to C# on
the middle of May 2006.

I've learned such great things working with the code as mastering my
knowledge with the Generics in C#. I even got one third party DLL
called C5.dll from http://www.itu.dk/research/c5/ which includes lots
of new classes to work with sets of data as is necessary when dealing
with regular expressions.

Yesterday, 1/2, when I woke up I just thought why not to continue
working on that project about a regex parser. So, I started again with
new motivations.

I started translating your whole article from English to Portuguese
and I'd like to ask you if we could use some text and diagrams of your
article as the base of our article and we would present it with both
versions of code, C++ and C#.

Our aim is to forward it to be issued in our university's scientific
magazine and propagate to the ones interested – see
http://www.ubm.br/ubm/paginas/copep/inicial.php?pagina=apresentacao.php
(it's in Portuguese - you're learning Spanish and it's a little bit
alike. I've already publicized one article in this magazine with the
title: Development and Numeric Simulation of Algorithms for
Computational Resolution of Ordinary Differential Equations. It was a
project I developed while I was in the 4th term. The code was
developed using C++. Further, this article will be available in the
Internet through an indexing project that's been prepared by a
university here in Brazil.

We'll certainly cite the article's fonts as you can see in the
preliminary code I'm going to send you in this e-mail. Your great
initiative of publicizing the article must be preserved by any mean.
Haven't you used any other bibliography? Was it developed by just your
thoughts?

Remember that this code I'm sending you needs a review because I have
to pass through it to get it all in mind again and see if everything
is Ok. To run the code you just need to recompile it using a C#
compiler. I use the Microsoft Visual Studio 2005 Express Edition as
you mentioned in your home page here
http://eli.thegreenplace.net/2006/12/03/a-complete-c-development-environment-from-microsoft-free/
The initial parameter I'm using is set in the command line argument
inside the project's properties in Debug and is as follow:
RegularExpressionEngine "(a|b)*abb" aaababb
It's seems to be working Ok.
Try it for yourself!

From now on we're just waiting for your reply.

By the way, if you and your wife want to come to visit Brazil you can
certainly come to my house. I'd like to have you as my guests. Who
knows, one day I can go to Israel and meet you personally. That would
be great too. Israel is the apple of God's eye!

I simply think we have the same liking!

It was a pleasure to write to you.

My best regards,
                         Leniel Braz de Oliveira Macaferi
                         Volta Redonda, Rio de Janeiro, Brazil


Gmail
Leniel Macaferi <le@…>


Algorithmic Forays for a Regex Parser

Eli Bendersky <eli@…>                                                Sun, Jan 21, 2007 at 2:03 AM

To: "le@…" <le@…>

Hello Leniel,

First of all, thanks for all the compliments - I'm flattered.
You can certainly use any part of the articles and the diagrams for your translation. As for my sources, I certainly didn't invent anything - I took all the material from the "Dragon Book" ("Compilers" by Aho, Sethi and Ullman), perhaps with the help of some googling, but as far as I remember, the book was the main source.

Keep being enthusiastic about your studies. That's a good thing :-)

Regards,

Eli


As you can see from the above e-mail, Eli agreed and so it was OK to write a Portuguese version of his articles.

As the semesters passed and new commitments queued I stopped translating the articles and I have just decided this past week that now it’s a good time to come back to it and finish the planned objective of 2 years ago.

While I write the next blog posts I’ll translate the remaining articles to Portuguese. I expect to publish it online here on this blog so that it can help people out there to learn this fascinating subject of Computation Theory.

I’ll follow a different path to explain how the regular expression engine work, that is, from part 1 to part 6 I’ll be showing each division of the engine by means of its representing class in code. I’ll do so, so that I can present each part of the translated code (C++ to C#) in such a way that it gets easier to understand. I’ll use a top down approach, that is, I’ll begin showing the higher level class and then I’ll go deep to its dependencies.

This is the story behind the Regular Expression Engine in C# I’ll be presenting in the next posts, starting with this one.

Hope that you like it as much as I do. This was for sure one of or even the best discipline I’ve had in the Computer Engineering course and one of the more exciting things to write about.

Updated on 5/12/2009 09:41:00 PM

As I finished writing the series of posts, here goes the list that points to them:

Regex engine in C# - the Regex Parser
Regex engine in C# - the NFA
Regex engine in C# - the DFA
Regex engine in C# - matching strings

Source code: https://github.com/leniel/RegexEngine

Fortran Numerical Constants Recognizer

Compilers construction paper
I and a dear brother in faith of mine called Wellington Magalhães Leite wrote a paper titled: Fortran Numerical Constants Recognizer

It exercises concepts related to state transition diagram, finite state machine and automata theory.

Needless to say, compilers construction was one of the best disciplines or even the best discipline I had in the computer engineering course. I really like this kind of stuff. State machines and automata theory really rocks! Automata theory is other discipline I had that really got my attention.

State Transition Diagram
This is the lexical state transition diagram in which our work is based:

Fortran Numerical Constants - State Transition Diagram

Alphabet = {d, +, - , . , E} (d is any digit)

The arches of any vertex to an Error vertex aren’t showed but they exist.

Expressions like the following are recognized: 13, +13, -13, 13., 1.25, .25, -.25, -32.43, 13E-15, 13.E-15, -13.25E+72, .75E5, etc.

Expressions like the following aren’t recognized: ++13, .E13, etc.

The vertex with a 0 (zero) value is the initial state.

The vertexes with bold circles are the final states.

Note: It’s important to note that lexical state transition diagrams are finite automatons.

Abstract
A didactic method for the construction of a compiler front-end is the one substantiated in transition diagrams. So, its construction helps with the familiarization regarding the tasks of a compiler project.

This paper presents a Fortran numerical constants recognizer. It is developed based on a state transition diagram and its implementation follows the standards of the C# programming language.

Keywords: fortran, compiler construction, state transition diagram, C# programming language

CONTENTS
1 INTRODUCTION 6
  1.1 Objective	6
  1.2 Definition 6
2 DEVELOPMENT 7
  2.1 Mapping the constants 7
  2.2 Mapping the functions 7
  2.3 Application main entry point 10
3 APPLICATION 12
  3.1 Validating expressions 12
      3.1.1 Single expression 12
      3.1.2 Set of expressions 13
4 CONCLUSION 14
5 REFERENCES 15
6 ADDENDUM 16

Resumo
Um método muito didático para a construção do front-end de um compilador é aquele fundamentado em diagramas de transição. Logo, seu estudo ajuda na familiarização com as atividades envolvidas no projeto de um compilador.

Esse trabalho apresenta um sistema reconhecedor de constantes numéricas em Fortran. O mesmo é desenvolvido a partir de um diagrama de transição de estados e sua codificação segue os padrões e moldes da linguagem de programação C#.

Palavras-chave: fortran, construção de compiladores, diagrama de transição de estados, linguagem de programação C#

SUMÁRIO
1 Introdução 5
   1.1 Objetivo	5
   1.2 Definição 5
2 Desenvolvimento 6
   2.1 Mapeamento das constantes 6
   2.2 Mapeamento das funções 6
   2.3 Mapeamento da chamada principal 10
3 Aplicação 12
   3.1 Validação de expressões 12
       3.1.1 Uma única expressão 13
       3.1.2 Uma lista de expressões 14
4 Conclusão 15
5 Bibliografia	16
6 Anexo 17

Source code

class StateTransitionSystem
{
  public static void Main(string[] args)
  {
    Console.Title = "State Transition System for recognizing numeric constants in FORTRAN";

    Console.BackgroundColor = ConsoleColor.White;
    Console.ForegroundColor = ConsoleColor.Black;

    char ch;

    do
    {
      Console.Clear();

      // Print startup banner
      Console.Write("\nState Transition System C# Sample Application\n");
      Console.Write("Copyright ©2006 Leniel Braz de Oliveira Macaferi & Wellington Magalhães Leite.\n\n");
      Console.Write("UBM COMPUTER ENGINEERING - 7TH SEMESTER [http://www.ubm.br/]\n\n");

      // Describes program function
      Console.Write("This program example demonstrates the State Transition Diagram's algorithm for\n");
      Console.Write("numeric constants validation in FORTRAN.\n\n");

      // Describes program's options     
      Console.Write("You can validate expressions by two different ways as follow:\n\n");
      Console.Write("1 - A single expression by providing an entry.\n\n");
      Console.Write("2 - A set of expressions by providing an input file.\n");
      Console.Write("  * Notice: the expressions must be separeted in-lines.\n");

      Console.Write("\n\nEnter your choice: ");

      ch = Console.ReadKey().KeyChar;

      switch(ch)
      {
        case '1':
          {
            SingleExpression();
            break;
          }
        case '2':
          {
            SetOfExpressions();
            break;
          }
      }
    }
    while(ch == '1' || ch == '2');
  }

  /// <summary>
  /// Enumeration where each item corresponds to one state in the State Transition Diagram.
  /// </summary>
  public enum PossibleStates
  {
    s0 = 0,
    s1,
    s2,
    s3,
    s4,
    s5,
    s6,
    s7,
    error
  }

  /// <summary>
  /// Array of type PossibleStates, which contains the finalStates acceptable by the
  /// State Transition Diagram.
  /// </summary>
  public static PossibleStates[] finalStates = { PossibleStates.s2, PossibleStates.s3, PossibleStates.s6 };

  /// <summary>
  /// Verifies if the last state achieved by the machine belongs to the finalStates
  /// array.
  /// </summary>
  public static bool IsFinalState(PossibleStates currentState)
  {
    for(int i = 0; i < finalStates.Length; i++)
      if(currentState == finalStates[i])
        return true;

    return false;
  }

  /// <summary>
  /// Recognizes the current state and the character “label” being analysed, values
  /// passed as parameters. After, the function does the transition of state case some
  /// condition is satisfied, otherwise, the function will return an error flag.
  /// </summary>
  public static PossibleStates Recognizer(PossibleStates currentState, char c)
  {
    switch(currentState)
    {
      case PossibleStates.s0:
        {
          if(c == '+' || c == '-')
            return PossibleStates.s1;

          if(char.IsDigit(c))
            return PossibleStates.s2;

          if(c == '.')
            return PossibleStates.s4;

          break;
        }

      case PossibleStates.s1:
        {
          if(char.IsDigit(c))
            return PossibleStates.s2;

          if(c == '.')
            return PossibleStates.s4;

          break;
        }

      case PossibleStates.s2:
        {
          if(char.IsDigit(c))
            return PossibleStates.s2;

          if(c == '.')
            return PossibleStates.s3;

          if(c == 'E')
            return PossibleStates.s5;

          break;
        }

      case PossibleStates.s3:
        {
          if(char.IsDigit(c))
            return PossibleStates.s3;

          if(c == 'E')
            return PossibleStates.s5;

          break;
        }

      case PossibleStates.s4:
        {
          if(char.IsDigit(c))
            return PossibleStates.s3;

          break;
        }

      case PossibleStates.s5:
        {
          if(char.IsDigit(c))
            return PossibleStates.s6;

          if(c == '+' || c == '-')
            return PossibleStates.s7;

          break;
        }

      case PossibleStates.s6:
        {
          if(char.IsDigit(c))
            return PossibleStates.s6;

          break;
        }

      case PossibleStates.s7:
        {
          if(char.IsDigit(c))
            return PossibleStates.s6;

          break;
        }
    }
    return PossibleStates.error;
  }

  /// <summary>
  /// Reads an input expression, recognizes its characters, changes the states
  /// accordingly to those characters and hence validates the entry.
  /// </summary>
  public static void SingleExpression()
  {
    do
    {
      // The machine points to the initial state of the State Transition Diagram.
      PossibleStates currentState = PossibleStates.s0;

      Console.Write("\n\nEnter the expression to be evaluated: ");

      // strExpression receives the entry typed by the user.
      string strExpression = Console.ReadLine();

      /* For each string's character (label), calls the function Recognizer that
      on the other hand changes the machine state accordingly. */
      for(int i = 0; strExpression.Length > i; ++i)
        if(currentState != PossibleStates.error)
          currentState = Recognizer(currentState, strExpression[i]);
        else
          break;

      /* Calls the function IsFinalState to verify if the state where the machine
       stopped is a final state or not. */
      if(IsFinalState(currentState))
        Console.WriteLine("\n Valid expression.\n");
      else
        Console.WriteLine("\n Invalid expression!\n");

      Console.Write("Do you wanna try again? (y\\n) ");
    }
    while(Console.ReadKey().KeyChar == 'y');
  }

  /// <summary>
  /// Reads an input file, recognizes its lines, expression by expression and changes
  /// the states accordingly to each expression. In other words, validates the entire
  /// list.
  /// </summary>
  public static void SetOfExpressions()
  {
    do
    {
      Console.Write("\n\nEnter the file path: ");

      // Obtains the file name.
      string fileName = Console.ReadLine();

      // Verifies if the file exists.
      if(!File.Exists(fileName))
        Console.Write("\n File not found!\n\n");
      else
      {
        // Reads all the file's lines and stores them.
        StreamReader sReader = new StreamReader(fileName);

        string expression;

        // Evaluates each line until achieve the EOF (end of file).
        while((expression = sReader.ReadLine()) != null)
        {
          // The machine points to the initial state of the State Transition Diagram.
          PossibleStates currentState = PossibleStates.s0;

          /* For each expression's character (label), calls the function Recognizer that
          on the other hand changes the machine state accordingly. */
          for(int i = 0; expression.Length > i; ++i)
            if(currentState != PossibleStates.error)
              currentState = Recognizer(currentState, expression[i]);
            else
              break;

          /* Calls the function IsFinalState to verify if the state where the machine
           stopped for the expression is a final state or not. */
          if(IsFinalState(currentState))
            Console.WriteLine("\n{0} is a valid expression.", expression);
          else
            Console.WriteLine("\n{0} is an invalid expression!", expression);
        }
        sReader.Close();
      }
      Console.Write("\nDo you wanna try again? (y\\n) ");
    }
    while(Console.ReadKey().KeyChar == 'y');
  }
}

Screenshots
See screenshots of Fortran numerical constants recognizer in action:

Fortran Numerical Constants Recognizer - Single Expression

Fortran Numerical Constants Recognizer - Set Of Expressions

You can get a PDF copy of the paper and the accompanying Fortran Numerical Constants Recognizer files in a ZIP file at:

English version
http://leniel.googlepages.com/FortranNumericalConstantsRecognizer.pdf

Portuguese version
http://leniel.googlepages.com/SisReconhecedorConstNumericasFortran.pdf

Microsoft Visual Studio C# 2008 Console Application project
http://leniel.googlepages.com/FortranNumericalConstantsRecognizer.zip

Note: The program’s source code is in a Microsoft Visual Studio C# 2008 Console Application project. If you don’t have Visual Studio, you can get a free copy of its express edition at http://www.microsoft.com/express/download.

Chemtech compliments newly graduated engineers

Last Friday, January 16th, I was complimented by Chemtech. The company gave compliments to all employees that finished their college course in 2007/2008.

Chemtech compliments engineers 
The computer engineers including me beside Rubião from right to left.

We got together at the 23rd floor of Rio de Janeiro’s office where Luiz Eduardo Ganem Rubião (CEO and founder of Chemtech) talked a little bit about the company’s perspectives giving us some insight related to life, economy, the job market and ongoing and future projects.

The way Rubião thinks about life subjects, mainly being an optimist is how I face the day to day. There’s no time to be wasted with illusions and we should always think positive. Doing so we’ll for sure harvest good fruits.

The following is the transcription of the letter I received with a present (a book of my area of specialization - computer engineering) …

Rio de Janeiro, January 16th, 2009

Dear Leniel Braz de Oliveira Macaferi,

Young people like you represent the future of technology. With your know how you can contribute to Brazil’s development so that it stands out in the worldwide technology. It is in the knowledge, in technology that a differential arises.

Chemtech is proud to have been with you in your first steps to engineering and for us still being together. It’s with satisfaction that today we call you an engineer! A Chemtecheano engineer, a Brazilian engineer.

We’re together in this journey that starts with your graduation. Success!

A strong hug from the Chemtech family.

Nanotechnology and the future of technology

Nanotech Nanotechnology refers to a field of applied science and technology whose theme is the control of matter on the atomic and molecular scale, generally 100 nanometers (billionths of meters) or smaller, and the fabrication of devices or materials that lie within that size range. Is any technology that is based on the placement or manipulation of single atoms.

Many innovations will come to light, which will make extensive use of nanotechnology. We know little about the natural phenomena that are surrounding us. In truth everything is already made and is near us, but we just can’t see because we need in the course of time develop our science and create new tools that will make us capable of discovering new chemical elements.

There is a famous maxim from French chemist Lavoisier that is:
In nature nothing is lost, nothing is created, everything is transformed.


We are in a constant process of evolution and development.

With the discovering of new chemical elements and inherent natural phenomena, we’ll be capable of creating new types of materials what on the other hand will bring over more and more discoveries. Discoveries lead to innovations.

If we think that we are working with minuscule particles and that the smaller particle hasn’t been discovered yet, we can assert that we have a lot to learn. In truth it wasn’t long since that the first chemical elements were discovered.

Now it’s interesting to catch sight of how many nice opportunities there are to use nanotechnology. As an example: the manufacturing process of computer processors. I just can’t wait to have a super fast computer. To that end it’s necessary that nanotechnology evolves. For a faster processor it is necessary that billions of small electrical components called transistors be placed in a microchip. The smaller the transistors the greater amount of them can be put in a single chip. There’s a law specific to this subject called Moore's law.

New techniques can also be applied in the medicine field with the development of robots invisible to the human eyes that can flow within the human body fighting against all sorts of diseases.

At last, nanotechnology is a really important and promising technology and I expect that it evolves rapidly for the sake of men’s well being of course because other forms of use also exists. I won’t comment about them here. I think that the reader caught what I want to express with this. If not, try to remember about war technologies.

I can foresee that in a time period of 40 years (approximately 2050) we’ll be in a new baseline and nanotechnology will be a completely forgotten technology. As a matter of fact, it always happens with technologies.

As of the date of this post we already have 11 nanometers technology. For comparison, the processor Intel Core Duo that is the one I use today is manufactured with a 65 nanometers technology.

Enterprise Management Software

GetToTheTargetThe crescent demand for enterprise management software has as key factor big companies, which need to manage millions of records trying with that: find a better way of working with so great amount of information available in their huge databases and get to the target of their business.

Between the most used software for enterprise management are the ones that lie in one of the following main categories:

  • Enterprise Resource Planning (ERP)
  • Customer Relationship Management (CRM)
  • Supply Chain Management (SCM)
  • Balanced Scorecard or Scoreboard (BSC)

ERP

Resources planning in a company (ERP) include or try to include all the company’s data and processes in a unified system. A typical ERP system will use multiple software and hardware components to achieve integration. A key ingredient for most ERP system is the use of a unified database to store the data from the various modules of the system.

CRM

Customer relationship management uses software products developed to help the companies in the construction and keeping of a good relationship with their clients. This is done through the storage of information of each client in an efficient and intelligent way. CRM is a general term that encompasses concepts used by the companies to manage their clients and includes capture, storage and analysis of data, suppliers, partners and internal processes information.

SCM

Supply chain management is the process of planning, implementing and controlling the operations of the supply chain to the highest efficiency degree possible. SCM includes all the transport logistics and storage of raw material and ready products from the origin to the destiny consumption market.

BSC

Balanced scorecard or scoreboard is a concept used to measure if the activities of a company are yielding expected results in terms of the strategic vision. Its focus isn’t only the financial result but also human questions. It helps in the provisioning of a more comprehensive vision of an organization what in its turn helps the companies to act according to the best way possible in a long term basis.

How is it used?

All these software work integrated.

ManagementReport The chairman, VP, CFO, CEO, CIO, CTO, etc., simply sat down in front of their computer and request a report. The systems then combine the data stored in unified databases, which have all the company’s historical data using technologies such as OLAP and OLTP. Business intelligence (BI), artificial intelligence (AI), data mining, etc, are concepts extensively applied to do the junction of enterprise data.

It has all to do with relevant data and how to better combine them through (SQL queries) to extract useful information. Information = processed data = money! :-)

My computer engineering final project verses about LINQ (Language Integrated Query), that is, the integration of the query language (SQL) into the programming language (specifically C#). It's a good starting point for those that wanna learn the principles of working with queries to extract useful information from databases.

What to expect from the future?

The good part of it is that this market is growing in a fast pace. This way, a lot of job opportunities are emerging.

I particularly have interest in this field: databases, computer networks, BI, HPC, etc. My area of specialization will be for sure system analysis with focus on software engineering and infrastructure.

I expect to work in this field in a near future. It’s really interesting and challenging because there are always details that can be improved, optimized.

Information management (IM) is the key to the proper organization of a company. If a company organizes its data, its income will grow. It’s a proved fact.

Just as a note: one alarming factor that can arise is the security related to all the information of a company what creates the field of information security.

A new area that is attracting attention nowadays is parallel computing, most precisely, high performance computing, which is other very important factor when dealing with huge sets of data as is the case of enterprise management software.

ParallelDataComputing

Parallel processing will be the focus on the next years because it is necessary for the next step toward fastest software processing. Long live the dual-quad (many) core processors.

Quicksort and Binary Search algorithms in C++

This post is related to one of the best coursework I've done during the computer engineering course. I'm really proud of it. It was the cornerstone in my programming career and helped me choose an area to put my efforts from that moment on. I was in the 5th term out of 10 (3rd year of the course out of 5 more exactly). The discipline was Programming Languages.

This subject is fantastic and is used extensively throughout the the computer science field.

First I'll give a short description about the Quicksort and Binary Search algorithms and then I'll present the work that I and my dear brother in faith Wellington Magalhães Leite did.

Quicksort

Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare. Typically, quicksort is significantly faster in practice than other sorting algorithms, because its inner loop can be efficiently implemented on most architectures.

Binary Search

A binary search algorithm (or binary chop) is a technique for finding a particular value in a sorted list. It makes progressively better guesses, and closes in on the sought value by selecting the median element in a list, comparing its value to the target value, and determining if the selected value is greater than, less than, or equal to the target value. A guess that turns out to be too high becomes the new top of the list, and a guess that is too low becomes the new bottom of the list. Pursuing this strategy iteratively, it narrows the search by a factor of two each time, and finds the target value.

Our paper

Our paper is entitled Quicksort and Binary Search Algorithms. You can get a copy at the end of this post.

Without more ado, see its abstract bellow:

Sorting and searching algorithms are a core part of the computer science area. They are used throughout the programming work when you need to sort a set of data and when you need to search for a specific record (key) present in such set of data.

Quicksort is one of the fastest (quick) sorting algorithms and is most used in huge sets of data. It performs really well in such situations.

Binary search tree is one of the fastest searching algorithms and is applied in a sorted set of data. It reduces the search space by 2 in each iteration, hence its name (binary).

In this paper we present the intrinsic nature of each algorithm as well as a functional implementation of such algorithms in the C++ programming language.

Keywords: quicksort, binary search, sorting algorithms, searching algorithms, c++ programming language

CONTENTS
1 INTRODUCTION 6
  1.1 Objective 6
  1.2 Definition 6
      1.2.1 Sorting algorithms 6
      1.2.2 Searching algorithms 7
2 DEVELOPMENT 8
  2.1 Quicksort algorithm (swapping and partitioning) 8
      2.1.1 Detailed steps 8
  2.2 Binary search algorithm 9
  2.3 Studying the efficiency of the methods 9
      2.3.1 The big O notation 9
      2.3.2 Quicksort efficiency 10
            2.3.2.1 The best and the worst case 10
            2.3.2.2 Comparison with other sorting algorithms 11
      2.3.3 Binary search efficiency 11
3 APPLICATION 12
  3.1 Quicksort implementation 12
  3.2 Binary search implementation 13
4 CONCLUSION 14
5 REFERENCES 15

Words of wisdom

As I mentioned, during the 5th term of the computer engineering course our teacher Marcus Vinicius Carvalho Guelpeli selected some sorting and searching algorithms to pass to the class as a coursework.

The coursework should be done in pairs and each pair should select a sorting and searching algorithm to compose a paper about it. We selected the quicksort and the binary search algorithms. The teacher advised us that these weren't the easy ones. I just thought: that's what I want. I don't want the easy ones. Why? Because if you get just the easy problems I'll never understand something that demands a more deep approach and every time a difficult task is given you'll tend to refuse it. What's the best thing to do? Just accept the challenge and go for it. Chances are you'll succeed. That's just what happened with us.

We haven't just written about the quicksort and the binary search, we implemented it and presented it to the class in a power point presentation. The teacher liked it so much that our grade was the highest possible! :-) In the end what did we feel? An amazing feeling. Something such that the work had been done and we learned a lot from it. That’s what a college and even an online computer science degree program is supposed to do. Give you the subjects and motivate you; teaching the basic so that you can dig up the more difficult aspects of the subject being taught.

So what's next? Well, I'll explain how we implemented the quicksort and the binary search algorithms.

Quicksort and Binary search algorithms implementation in C++

One of the things that I've always listened to was about code reuse. You search for something already implemented so that you just haven't to reinvent the wheel. Of course you'll complement or even adapt the available code to your situation. That was what we did. I found some code for the quick sort at MSDN and implemented the binary search one. Unfortunately the page at MSDN isn't available anymore. It's been three years since I hit that page.

I wanted a way of measuring the time elapsed so that we could compare the efficiency of both methods when they were fed with different input data sets. The input is nothing more than a text file .txt full of numbers in this case. Can be any data you want. Each test case we passed a text file with different random numbers and different quantity of numbers. For example, in a test case we passed a file named 2500.txt, that means 2500 random numbers. In another test case we passed other file named 7500.txt as so on. I think you got it. Doing so we could compare how well the algorithms were performing.

To generate the random numbers we used an Excel spreadsheet with the formula =RAND()*1000000. For each set of data we generated new numbers and copied and pasted those numbers into the text files that are the input for our program. During a coursework as this one we get to learn everywhere, even a new formula in Excel. It's really good. ;-)

Again, I searched for a timing class that I could reuse with the code and for sure I found it. I didn't use it at all but I used it to learn about how to measure time in C++. It's amazing how fast you can implement something. Much of the things you need related to programming are already implemented. You just have to search for it as is what you're doing here, I think! You searched for the subject of this post and here you are seeing something implemented. Try to learn from it and just don't copy the entire work and think that you know about it. It's wrong. Try to understand what the code is doing. Dive into the theory because it explains the inner essence.

The code that follows is well commented which is something every developer should do. You see, it was three years ago when we worked with this code. Today it's difficult to remember every step I took. The comments helped me to remember almost everything.

Bellow I present the quick sort method we borrowed from MSDN (we adapted it to fit our case). Note the use of the Partition method (explained in the accompanying paper):

// QuickSort implementation
void QuickSort(char** szArray, int nLower, int nUpper)
{
 // Check for non-base case
 if(nLower < nUpper)
 {
   // Split and sort partitions
   int nSplit = Partition(szArray, nLower, nUpper);
   QuickSort(szArray, nLower, nSplit - 1);
   QuickSort(szArray, nSplit + 1, nUpper);
 }
}

// QuickSort partition implementation
int Partition (char** szArray, int nLower, int nUpper)
{
 // Pivot with first element
 int nLeft = nLower + 1;
 char* szPivot = szArray[nLower];
 int nRight = nUpper;

 // Partition array elements
 char* szSwap;
 while(nLeft <= nRight)
 {
   // Find item out of place
   while(nLeft <= nRight && strcmp (szArray[nLeft], szPivot) <= 0)
     nLeft = nLeft + 1;
   while (nLeft <= nRight && strcmp (szArray[nRight], szPivot) > 0)
     nRight = nRight - 1;

   // Swap values if necessary
   if(nLeft < nRight)
   {
     szSwap = szArray[nLeft];
     szArray[nLeft] = szArray[nRight];
     szArray[nRight] = szSwap;
     nLeft = nLeft + 1;
     nRight = nRight - 1;
   }
 }

 // Move pivot element
 szSwap = szArray[nLower];
 szArray[nLower] = szArray[nRight];
 szArray[nRight] = szSwap;
 return nRight;
}

Now see the binary search method implementation that we did:

int BinarySearch(char** szArray, char key[], int nLower, int nUpper)
{
 // Termination case
 if(nLower > nUpper)
   return 0;

 int middle = (nLower + nUpper) / 2;

 if(strcmp(szArray[middle], key) == 0)
   return middle;
 else
 {
   if(strcmp(szArray[middle], key) > 0)
     // Search left
     return BinarySearch(szArray, key, nLower, middle - 1);
   // Search right
   return BinarySearch(szArray, key, middle + 1, nUpper);
 }
}

The next ones are the method prototypes and the main entry point that calls a menu. According to the user passed parameters we call the quicksort and the binary search methods:

// Function prototypes
void Menu(void);
void QuickSort(char** szArray, int nLower, int nUpper);
int Partition(char** szArray, int nLower, int nUpper);
int BinarySearch(char** szArray, char key[], int nLower, int nUpper);

// Application initialization
void main(void)
{
 char op;

 do
 {
   Menu();
   printf("\n\nDo you wanna a new QuickSort? Y/N");
   op = getche();
   if(islower(op))
     op = toupper(op);
 }
 while(op == 'Y');
}

void Menu(void)
{
 // Clear screen
 system("CLS");

 // Control execution time
 clock_t initial, final;

 // Print startup banner
 printf("\nQuickSort C++ Sample Application\n");
 printf("Copyright (c)2001-2002 Microsoft Corporation. All rights reserved.\n\n");
 printf("MSDN ACADEMIC ALLIANCE [http://www.msdnaa.net/]\n\n");
 printf("BinarySearch C++ Sample Application\n");
 printf("Copyright (c)2005 Leniel Braz de Oliveira Macaferi & Wellington Magalhaes Leite.\n");
 printf("UBM COMPUTER ENGINEERING - 5TH SEMESTER [http://www.ubm.br/]\n\n");

 // Describe program function
 printf("This program example demonstrates the QuickSort and BinarySearch algorithms by\n");
 printf("reading an input file, sorting its contents, writing them to a new file and\n");
 printf("searching on them.\n\n");

 // Prompt user for filenames
 char szSrcFile[1024], szDestFile[1024];
 printf("Source: ");
 gets(szSrcFile);
 printf("Output: ");
 gets(szDestFile);

 // Read contents of source file
 const long nGrow = 8;
 long nAlloc = nGrow;
 long nSize = 0;
 char** szContents = new char* [nAlloc];
 char szSrcLine[1024];
 FILE* pStream = fopen(szSrcFile, "rt");

 while(fgets(szSrcLine, 1024, pStream))
 {
   // Trim newline character
   char* pszCheck = szSrcLine;
   while(*pszCheck != '\0')
   {
     if(*pszCheck == '\n' && *(pszCheck + 1) == '\0')
       *pszCheck = '\0';
     pszCheck++;
   }

   // Append to array
   szContents[nSize] = new char [strlen(szSrcLine) + 1];
   strcpy(szContents[nSize], szSrcLine);
   nSize = nSize + 1;

   if(nSize % nGrow == 0)
   {
     // Resize the array
     char** szPrev = szContents;
     nAlloc += nGrow;
     szContents = new char* [nAlloc];
     memcpy(szContents, szPrev, nSize * sizeof(char*));
     delete szPrev;
   }
 }
 fclose (pStream);

 initial = clock();

 // Pass to QuickSort function
 QuickSort(szContents, 0, nSize - 1);

 final = clock();

 // Write sorted lines
 pStream = fopen (szDestFile, "wt");
 for(int nIndex = 0; nIndex < nSize; nIndex++)
 {
   // Write line to output file
   fprintf (pStream, "%s\n", szContents[nIndex]);
 }
 fclose (pStream);

 // Report program success
 printf("\nThe sorted lines have been written to the output file.\n\n");

 // QuickSort execution time
 double duration = (double)(final - initial) / CLOCKS_PER_SEC;

 printf("The QuickSort execution time was: %2.9lf s = %.0lf ms = %.0lf \xE6s\n\n", duration, duration * 1000, duration * 1000000);

 char op = '\0';

 do
 {
   printf("Do you wanna a BinarySearch to locate a specific key? Y/N");

   op = getche();
   if(islower(op))
     op = toupper(op);
   if(op == 'Y')
   {
     printf("\n\nType the key you want to search for: ");
     char key[1024];
     gets(key);

     initial = clock();

     if(BinarySearch(szContents, key, 0, nSize - 1))
     {
       final = clock();

       duration = (double)(final - initial) / CLOCKS_PER_SEC;

       printf("\nKey found!\n\n");

       printf("The BinarySearch execution time was: %2.9lf s = %.0lf ms = %.0lf \xE6s\n\n", duration, duration * 1000, duration * 1000000);

     }
     else
     {
       final = clock();

       duration = (double)(final - initial) / CLOCKS_PER_SEC;

       printf("\nKey not found!\n\n");

       printf("The BinarySearch execution time was: %2.9lf s = %.0lf ms = %.0lf \xE6s\n\n", duration, duration * 1000, duration * 1000000);

     }
   }
   else
   {
     // Deallocate entire array
     for(int nIndex = 0; nIndex < nSize; nIndex++)
       // Delete current array element
       delete szContents[nIndex];

     delete szContents;
     szContents = NULL;
   }
 }
 while(op == 'Y'); 
}

Visual Studio C++ Console Application

You can get the project files at:
http://leniel.googlepages.com/QuicksortBinarySearchCPlusPlus.zip

Random number generator

You can get the spreadsheet responsible for this task at:
http://leniel.googlepages.com/QuicksortBinarySearchRandomNumGen.xls

How to use it?

To use the program:
  1. Enter the name of a file that contains unsorted data;
  2. Use the sample files included in the .ZIP package as: 1000.txt and 2500.txt;
  3. In the command line "Source", type: 1000.txt;
  4. In the command line "Output", type a name to the file that will be sorted. e.g.: sorted.txt;
  5. After the sorting process, choose if you want or not to execute a Binary Search. If yes, provide a value to be searched. If not, choose if it is or not desired to execute a new Quicksort.

Postscript:
- To generate random numbers, use the file Random numbers generator.xls file;
- The file QuicksortBinarySearch.cpp contains the source code. The same can be used freely. Mention the authors.

Efficiency comparison

For the sake of comparison I've run some test cases with different input files. See the result in the table that follows:

Quicksort and Binary search performance
n File name File size (bytes) Timing (milliseconds)
Quicksort Binary search
10000 10000.txt 122.880 16 0
25000 25000.txt 200.704 78 0
50000 50000.txt 401.408 219 0
75000 75000.txt 602.112 360 0
100000 100000.txt 802.816 516 0

It's important to note that the time the quicksort takes appears to be longer but it is not. Why? Because the the program needs to read the file content and write the sorted data back to the output file so that it appears to take longer than the milliseconds shown on the above table. The timing functions just operate while the quicksort is running.

For the the binary search key I've input a value localized in the beginning of the sorted file, in the middle and in the end. There was no time changes. The binary search found the key I entered with a time less than (0 µs - microsecond). I have an AMD Athlon XP 2400 with 512 MB RAM.

See a screenshot of the last test case:

QuicksortBinarySearchCPlusPlusTestCase

The paper

You can get a copy of the paper in the .PDF format at:
http://leniel.googlepages.com/QuicksortAndBinarySearchAlgorithms.pdf

Win32 API Mouse interaction

Windows API
The Windows API, informally WinAPI, is Microsoft's core set of application programming interfaces (APIs) available in the Microsoft Windows operating systems. All Windows programs except console programs must interact with the Windows API regardless of the language.

Win32 API
The Win32 API is the 32-bit API for modern versions of Windows. The API consists of functions implemented, as with Win16, in system DLLs. The core DLLs of Win32 are kernel32.dll, user32.dll, and gdi32.dll. Win32 was introduced with Windows NT.

Words of wisdom
Dealing with the Win32 API appears to be a regression since it takes us to the last century, that is, when programming with such API we are writing code that resembles the code written 15 years ago or more. Regression was the feeling I felt when the teacher said we'd study this API. Nevertheless, there are billions of lines of code that need maintenance because great part of these lines are used in legacy systems. So you see that learning this API is fundamental even today. This is in contrast with the mainframe computers dilemma. Even today there are a bunch of companies that still use them because of legacy systems. The feeling of regression was substituted by a enthusiasm one in the end.

Mouse interaction app
One coursework related to the Object Oriented Systems discipline I had to develop was a program that draws on the screen by free hand with the mouse assistance. The program must monitor the mouse movement, indicating its position (x, y) in the upright corner of the application window, in the format (xxx, yyy). Pressing the left mouse button it draws and pressing the right mouse button, it wipes off the screen content.

Some valuable tips that the teacher gave:
Use the events WM_MOUSE, WM_LBUTTONDOWN, WM_LBUTTONUP, WM_RBUTTONDOWN, WM_RBUTTONUP and the function SetPixel()

lword(lParam) has 0 x
hiword(lParam) has 0 Y

I originally implemented this program using DevC++. Today while composing this post I just created a new Win32 Project application using the Microsoft Visual Studio C++ Express Edition. Although the implementation differs a little bit, it wasn't difficult to port it to the Microsoft programming environment. I'll provide both implementations at the end of the post.

Let's see the mouse app main block of code. It's inside the WndProc function that is responsible for processing the messages for the main window. The code is commented so I think it's unnecessary to add more to it.

//
// FUNCTION: WndProc(HWND, UINT, WPARAM, LPARAM)
//
// PURPOSE: Processes messages for the main window.
//
// WM_COMMAND - Process the application menu
// WM_PAINT - Paint the main window
// WM_DESTROY - Post a quit message and return
// WM_LBUTTONDOWN - Left mouse button clicked
// WM_RBUTTONDOWN - Right mouse button clicked
// WM_LBUTTONUP - Left mouse button released
// WM_RBUTTONUP - Right mouse button released
// WM_MOUSEMOVE - Controls the mouse movement
//
LRESULT CALLBACK WndProc(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam)
{
int wmId, wmEvent;

switch(message) //Handle the messages
{
case WM_DESTROY:

PostQuitMessage(0); // Send a WM_QUIT to the message queue

break;

case WM_PAINT:
// TODO: Add any drawing code here...
hDC = GetDC(hWnd);

BeginPaint(hWnd, &paintStruct);

EndPaint(hWnd, &paintStruct);

break;

// Left button event used to print the screen
case WM_LBUTTONDOWN:

flag = true;

// Black color
newColor = RGB(0, 0, 0);

xMouse = LOWORD(lParam);

yMouse = HIWORD(lParam);

SetPixel(hDC, xMouse, yMouse, newColor);

break;

// Right button event used to erase the screen content
case WM_RBUTTONDOWN:

flag = true;

// White color
newColor = RGB(255, 255, 255);

xMouse = LOWORD(lParam);

yMouse = HIWORD(lParam);

SetPixel(hDC, xMouse, yMouse, newColor);

break;

// Sets the flag to false so that we know the left mouse button was released
case WM_LBUTTONUP:

flag = false;

break;

// Sets the flag to false so that we know the right mouse button was released
case WM_RBUTTONUP:

flag = false;

break;

// Controls the mouse movement and shows its current position on the Window title
case WM_MOUSEMOVE:

if(flag)
{
xMouse = LOWORD(lParam);

yMouse = HIWORD(lParam);

SetPixel(hDC, xMouse, yMouse, newColor);

sprintf_s(strTitle, " x=%d y=%d", xMouse, yMouse);

SetWindowText(hWnd, strTitle);

SetWindowText(hlabel, strTitle);
}
else
{
xMouse = LOWORD(lParam);

yMouse = HIWORD(lParam);

sprintf_s(strTitle, " x=%d y=%d", xMouse, yMouse);

SetWindowText(hWnd, strTitle);

SetWindowText(hlabel, strTitle);
}

break;

case WM_COMMAND:
wmId = LOWORD(wParam);
wmEvent = HIWORD(wParam);
// Parse the menu selections:
switch(wmId)
{
case IDM_ABOUT:

DialogBox(hInst, MAKEINTRESOURCE(IDD_ABOUTBOX), hWnd, About);

break;

case IDM_EXIT:

DestroyWindow(hWnd);

break;

default:

return DefWindowProc(hWnd, message, wParam, lParam);
}

break;

default: // For messages that we don't deal with

return DefWindowProc (hWnd, message, wParam, lParam);
}

return 0;
}

Reference
To get more insight regarding the Win32 API, go to the Win32 Development site at the Microsoft Development Network: http://msdn.microsoft.com/en-us/library/aa139672.aspx

Dev-C++ and Visual Studio Projects
DevC++ project
http://leniel.googlepages.com/Win32APIMouseInteractAppDevCPlusPlus.zip

Visual Studio Win32 project
http://leniel.googlepages.com/Win32APIMouseInteractAppVCPlusPlus.zip

Source code Indenter-Capitalizer

Following the coursework related to the Compilers Construction discipline I attended during the Computer Engineering course, I was asked to indent and capitalize the reserved words (keywords) of a source code file. More specifically, to do this work I should use the Syntactic Analyzer built with Flex and YACC that was created in a previous coursework task.

The source code file in question is the one being analyzed by the syntactic analyzer. This way at the same time it analyses the syntactic structure of the file it also indents and capitalizes the keywords.

The following is the content of the syntactic analyzer file named sinan.yacc:

%{
   #include <stdio.h>
   #include <stdlib.h>

   int c;
%}

%token PROGRAM
%token ID
%token SEMIC
%token DOT
%token VAR
%token COLON
%token INTEGER
%token REAL
%token RCONS
%token BOOL
%token OCBRA
%token CCBRA
%token IF
%token THEN
%token ELSE
%token WHILE
%token DO
%token READ
%token OPAR
%token CPAR
%token WRITE
%token COMMA
%token STRING
%token ATRIB
%token RELOP
%token ADDOP
%token MULTOP
%token NEGOP
%token CONS
%token TRUE
%token FALSE

%%

prog : PROGRAM {printf("PROGRAM "); c = 0;}

       ID {printf("%s", yytext);}

       SEMIC {printf(";\n\n");}

       decls

       compcmd

       DOT {printf(".");}
       {
         printf("\n Syntactical Analisys done without erros!\n");

         return 0;
       }
     ;

decls : VAR {printf("VAR ");} decl_list

      ;

decl_list : decl_list decl_type
            decl_type
          ;

decl_type : id_list COLON {printf(":");} type SEMIC {printf(";\n");}
          ;

id_list : id_list COMMA {printf(", ");} ID {printf("%s", yytext);}
          ID {printf("%s", yytext);}
        ;

type : INTEGER {printf(" INTEGER");}
       REAL {printf(" REAL");}
       BOOL {printf(" BOOL");}
     ;

compcmd : OCBRA {int i; for(i = 0; i < c; i++)printf(" "); printf("{\n"); c = c + 2;} cmd_list CCBRA {printf("\n"); int i; for(i = 2; i < c; i++)printf(" "); printf("}"); c = c - 2;}
        ;

cmd_list : cmd_list SEMIC {printf(";\n\n");} cmd
           cmd
         ;

cmd : {int i; for(i = 0; i < c; i++)printf(" ");} If_cmd
      {int i; for(i = 0; i < c; i++)printf(" ");} While_cmd
      {int i; for(i = 0; i < c; i++)printf(" ");} Read_cmd
      {int i; for(i = 0; i < c; i++)printf(" ");} Write_cmd
      {int i; for(i = 0; i < c; i++)printf(" ");} Atrib_cmd
      compcmd
    ;

If_cmd : IF {printf("IF ");} exp THEN {printf(" THEN\n");} cmd Else_cmd
       ;

Else_cmd : ELSE {printf("\n"); int i; for(i = 0; i < c; i++)printf(" "); printf("ELSE\n"); c = c + 2;} cmd {c = c - 2;}

         ;

While_cmd : WHILE {printf("WHILE ");} exp DO {printf(" DO\n");} cmd
          ;

Read_cmd : READ {printf("READ");} OPAR {printf("(");} id_list CPAR {printf(")");}
         ;

Write_cmd : WRITE {printf("WRITE");} OPAR {printf("(");} w_list CPAR {printf(")");}
          ;

w_list : w_list COMMA {printf(", ");} w_elem
         w_elem
       ;
w_elem : exp
         STRING {printf("%s", yytext);}
       ;

Atrib_cmd : ID {printf("%s ", yytext);} ATRIB {printf("= ");} exp
          ;

exp : simple_exp
      simple_exp RELOP {printf(" %s ", yytext);} simple_exp
    ;

simple_exp : simple_exp ADDOP {printf(" %s ", yytext);} term
             term
           ;

term : term MULTOP {printf(" %s ", yytext);} fac
       fac
     ;

fac : fac NEGOP {printf(" %s", yytext);}
      CONS {printf(" %s", yytext);}
      RCONS {printf(" %s", yytext);}
      OPAR exp CPAR
      TRUE {printf("TRUE ");}
      FALSE {printf("FALSE ");}
      ID {printf("%s", yytext);}
    ;


%%

#include "lex.yy.c"

As you can see there is an integer variable named c right beneath the #include section. This variable is incremented and decremented according to the section of code being analyzed (parsed). Such variable is then used inside the for loops. According to its current value blank spaces are written on the screen so that the next token parsed is printed on the right position (indented).

Each keyword parsed is capitalized and printed on the screen through a simple printf command.

Let's run a simple test case with the following source code file named test.txt. The code is intentionally not indented and it doesn't do much. It's just for a testing purpose.

program factorial;

var n, fact, i: integer;
{
read(n);

fact = 1;

i = 1;

while i <= n do
{
fact = fact * i;

i = i + 1
};

if i >= fact then
{
i = fact;

fact = fact + 1
}
else
i = fact + 1;



if i < fact then
{
i = fact;

fact = fact + 1
};

write("The factorial of ", n, " is: ", fact)
}.

In blue are the keywords, so the output should present them in CAPITALIZED letters. The blocks of code should also be presented with indentation according to the logic specified above. I use 2 white spaces to indent the blocks of code. That's why I use c = c + 2 (to increment) and c = c - 2 (to decrement) the c variable is responsible for controlling the indentation.

To run the test case it's necessary to build the syntactic analyzer. I won't show here all the steps involved since it's already done in the paper described in the post Syntactic Analyzer built with Flex and YACC. So I'll just compile the sinan.yacc file since it's the only file that was modified to accomplish this task. The other necessary files to generate the executable file - such as the lexical analyzer ones - are included in the download package at the end of this post.

For a better understanding of the commands used in this post and to set up the development environment I recommend that you read at least section 3.3 of the paper Syntactic Analyzer built with Flex and YACC. That said, let's do the job. :-)

To run this test case, follow the steps listed bellow:

  1. Create a folder called IndentCapCode on the root directory C:\, which will contain all the files created henceforward.
  2. Open a command prompt and type: path=C:\MinGW\bin;%PATH%. I consider that the MinGW installation was done in the folder C:\MinGW. Change it accordingly. After completing this step the tools to build the syntactic analyzer will be available in the command prompt.
  3. Generate the file y.tab.c in the command prompt with following command: yacc sinan.yacc.
  4. Compile the files with GCC in the command prompt with the following command: gcc y.tab.c yyerror.c main.c -osinan -lfl.

The result file for the syntactic analyzer is sinan.exe. To use it just type sinan < test.txt. The file test.txt contains the source code to be analyzed by the syntactic analyzer.

You can see the commands listed above in action through the following screenshot:

In a next post related to compilers construction I'll show the implementation of a semantic analyzer. This implementation was another coursework task. I used the same principle shown here to indent the code and even to colorize the keywords. But this is for another day.

You can get the .ZIP package that contains the files used in this post: http://leniel.googlepages.com/CodeIndenterCapitalizerFlexYACC.zip

Note: if you want, you can use a batch file (.bat) to automate the steps listed above. A .bat file named ini.bat is included in the .ZIP package. For more information about batch files, read my previous post Programming-Constructing a Batch file.

Distance Education and IT-Comm Pros

Distance EducationDistance education or distance learning is a positive manner of tackling global education shortcomings. If we think about the world population we can imagine that not everyone has easy access to the resources we’re so accustomed to. Considering time constraints, distance education reveals itself a great alternative because the time spent to get to a classroom can be a negative factor. Think about people that live outside of big cities. They can't even apply for a course they want because there is no such a course where they live. These barriers lessen education availability and so someone somewhere that doesn’t fit on the above time and place prerequisites has their plans thwarted and can't move forward towards the mainstay “Education”. It’s a really beautiful word. Look at it for an instant and realize what would be of you if you didn’t have the basic of it. To those people to acquire a diploma in some area of knowledge can be considered an unthinkable action.

With the advent of the Internet we have today a classroom right inside our own home. We have the information we want at the exact moment we want and in most part all the information is freely available, that is, we have the opportunity to learn about any subject without having to pay for that end avoiding expenses with specific material that in most cases will be used only once or at most twice to be frank.

Obviously there are prestigious institutions that provide distance education, which confer the most prestigious titles to those that finish a course without at least being present in a classroom. All that with the help of new technological advances that pervade our lives, which in turn make distance education something viable.

Particularly I don’t like the idea of distance education. I prefer to be present at the classroom. It would feel weird taking an educational leadership course in a virtual classroom. I believe that presence education is more enriching. I write this based upon all point of views, being the most important: more active social life interaction.

The virtual classroom world is interesting, yet it’s necessary to heed, for this can bring some undesirable consequences such as the lack of a direct liaison between the teacher/professor and their students and colleagues. Great part of our development is done through social interaction and this is almost foregone when we talk about distance education. There are plenty of other factors that influence my opinion, but for now this demonstrates one of the big bottlenecks that refrains the distance education of evolving in faster paces. There are still conservatives. It can be the case that in a near future such conservatives perish the thought.

The actual distance education environments, such as TelEduc in Brazil enable the distance education practice to reach more and more people. My alma mater UBM constituted a center for distance education called NEAD and uses TelEduc as its environment. I have used it and what I write here has as background my experience as a user of the system.

Tertiary institutions as the case of UBM offer a vast range of improvement courses for their employees and even support courses for students that are undergoing their internship programs and college conclusion projects. The idea appears to be good, but in practice it’s not adopted by everyone, perhaps because of the lack of information regarding the distance education platform’s features and capabilities.

It’s good to highlight that the devices and techniques used to implement distance education are in great part the result of advances pertaining to information and communication technologies. These technologies aggregate people from the more diverse knowledge fields. Information and communication technologies include the set of technological and computational resources set aside to generate and propitiate the use of information. Thus, such technologies are established on the following components: hardware and their peripheral devices, software and their resources, telecommunication systems and data and information management.

The people responsible for the development and management of these components are increasingly requested on the market. Between them are computer engineers, computer scientists, system analysts, business analysts, chief information officers (CIO), chief executive officers (CEO), chief financial officers (CFO) just to name a few.

The instructional level has been growing a lot during the last years, what rises the competitiveness on the job market. A proven fact is that executives have been using distance education to leverage their academic degree, attending courses, such as a Master of Business Administration (MBA) at international universities as MIT, University of Cambridge, University of California, Berkeley, etc. Notwithstanding the executives are on their houses in the comfort of their couches or beds.

I complete this brief analysis of the influence of distance education stating that the use of new technologies like the Internet and well developed distance education environments are making life a lot easier if we take into account the commodity and easy access to information of any given area.

It’s worth to remember about the inherited risks that most of the times pass by unnoticed. The utilization of technology in excess can enslave men, transforming them in slaves of their own inventions. Wherefore we must discuss such subject, aiming at the discovery of a steady point between the virtual and real life, providing a better way for a rich development environment where all people can evolve in a natural manner. I wrote and meant everybody, what presupposes the integration of the underprivileged people into this whole new world of information called Internet. Long live the blogs of life.