Saturday 27 July 2013

Easy Java Implementation of the Levenshtein distance (Edit Distance) Algorithm

Here I will give a few simple methods to implement the Levenshtein distance (Edit Distance) algorithm. This a good algorithm to implement for beginners in java or rather a programming language in general.
To know more about what this algorithm does and how it works, see my previous post here.

The first method takes the input the two string s to be compared, calls the matrixGenerator method and finally this returns the editDistance value.

int editDistValue(String str1, String str2) {
       
        int editRow = str1.length() + 1; //initialize row
        int editCol = str2.length() + 1; //initialize column

        int[][] editMatrix = new int[editRow][editCol];
        editMatrix = editMatrixGenerator(str1, str2, editMatrix, editRow,
                editCol);

        int editDistance = 999;
        editDistance = editMatrix[editRow - 1][editCol - 1];

        return editDistance;
    }

the editMatrix Generator creates the matrix using the algorithm and generates the matrix.

int[][] editMatrixGenerator(String str1, String str2,
            int[][] editMatrix, int editRow, int editCol) {
       
        int i, j;

        char[] rowString = str1.toCharArray();
        char[] colString = str2.toCharArray();

        for (i = 0; i < editRow; i++) {
         for (j = 0; j < editCol; j++) {
           if (i == 0) {
             editMatrix[0][j] = j;
            }
           if (j == 0) {
             editMatrix[i][0] = i;
            }
           if (i != 0 && j != 0) {
             if (rowString[i - 1] == colString[j - 1]) {
              editMatrix[i][j] = editMatrix[i - 1][j - 1];
             } else {
              editMatrix[i][j] = (Math.min(editMatrix[i - 1][j - 1],
                                   Math.min(editMatrix[i][j - 1],
                                   editMatrix[i - 1][j]))) + 1;
                    }
                }
            }
        }
        return editMatrix;
    }

Finally to get a proper confidence value we send the editDistance value obtained, along with the input to the edSimilarityCal method to get a value between [0,1].

float edSimilarityCal(String string1, String string2, int editDistance) {
       
        float maxLen=string1.length();
        if (maxLen < string2.length())
        maxLen = string2.length();
        if (maxLen == 0.0F)
        return 1.0F;
        else
        return 1.0F - editDistance/maxLen;  
    }


Well that's about it.
Cheers!

The Levenshtein Distance Algorithm (Edit Distance)

Ever wondered how a spellchecker works? or how the song matching app on your phone works?
Well, it is not as complicated as you might think. Its pretty much a variation in the Levenshtein Distance algorithm or the Edit Distance algorithm, named after Vladimir Levenshtein.

It works like this: To tell the distance or rather the levenshtein distance between two words, all you have to do is find out how many operations are required to convert one word to another.
For example, the distance between "running" and "cunning" would be just one, as all you have to do is replace 'r' with 'c'. Similarly the distance between "roast" and "rest" would be two, one operation is removal of one of the characters 'o' or 'a' and replacing the other with 'e'. So how do we implement  this?

The Algorithm goes like this:
Given the two words(strings in this case), we calculate the number of characters in both and create a matrix having rows and columns as the sizes of the words+1. It does not matter which word is the row and which is the column. The words themselves are not in the matrix, its just a matrix of integers.
So in the example of "rest" and "roast" the matrix would be of size 5x6(or 6x5) and would look like this:

Now, lets represent the position in the matrix as a[i,j]. i-->row and j-->column.
The first element a[0,0] is always initiated as 0. to get any other element say a[i,j] we do the following.
We first compare the two characters, if they are equal then we simply bring down the value from the diagonal element i.e. a[i-1,j-1] and place it in a[i,j].
Else we see minValue = minimum( a[i-1,j] , a[i,j-1], a[i-1,j-1]) and put this minValue+1 in a[i,j].
And follow the same steps repeatedly till all the cells in the matrix are filled.

Now that we have built the matrix, here is the analysis we can make from it:
-- The last element in the matrix gives us the distance between the words.
-- Any intermediate cell gives us number of operations to convert the characters traversed in the row to the characters in the column.
-- The first row a[i,0] and the first column a[0,j] would be incremental as that's the number of operations it would take to convert a null string to the characters traversed.

The final matrix would look like this:

So a spell checker would generally use this distance to suggest the user what he might have been actually looking for by comparing the input with the database and sort the edit distance and based on a certain threshold set would give a suggestion. Similarly a song matching app would convert the recorded audio (after some signal processing) to a string which would be matched with the strings with the database, to calculate this distance and the one with the least value would most likely be the one.
Well, then again this is just a very abstract way putting things, the actual implementation is much more complicated but kinda works on the same principles.

For a simple java implementation click here.
PS: What we just did was "Dynamic Programming". =D
Cheers!

Sunday 21 April 2013

EvalVid on NS-3 on Ubuntu 12.04

Using EvalVid on ns-3 is pretty much the same as running it on ns-2 but no tcl script is required fo create the network topology, its all in the .cc code.

Download the EvalVid Module from GERCOM here.
Author: Billy Pinheiro <haquiticos@gmail.com>

Now the steps are quite similar to my previous post till the trace file is generated.

1. First we need to download a yuv sequence which has to be converted to a m4v file. This can be done using  'ffmpeg' It will create a compressed raw video. For example download highway_cif.yuv which is available online.
 
$ffmpeg -i highway_cif.yuv highway_cif.m4v
 
2. Next we need to get the .mp4 file. Following command lines create ISO MP4 files containing the video samples (frames) and a hint track which describes how to packetize the frames for the transport with RTP.
 
$./MP4Box -hint -mtu 1024 -fps 30 -add highway_cif.m4v highway_cif.mp4
 
3. Now that we have the mp4 file, we need to generate the trace file required to sent into the network. The mp4trace tool from EvalVid is able to send a hinted mp4-file per RTP/UDP to a specified destination host. The output of mp4trace will be needed later, so it should be redirected to a file. The trace file thus generated is st_a01.st
 
$./mp4trace -f -s 192.168.0.2 12346 highway_cif.mp4 > st_highway_cif.st
 
4. Unpack the evalvid.zip file inside the folder  ns-3.13/src/

5. Move the highway_cif.mp4 and st_highway_cif.st files to ns-3.13/

6. ./waf configure --enable-examples

7. execute ./waf build

8. execute ./waf --run "evalvid-client-server"

9. The two files sd_a01 and rd_a01 are generated. The next step is the reconstruction of the transmitted video as it is seen by the receiver. For this, the video and trace files are processed by etmp4 (Evaluate Traces of MP4-file transmission): 
 
$./etmp4 -f -0 sd_a01 rd_a01 st_highway_cif.st highway_cif.mp4 highway_out

This generates a possibly corrupt vido file in which the lost frames are deleted.
 
10. decode into the .yuv format and then calculate the psnr
 
$ffmpeg -i highway_out.mp4 highway_out.yuv
$./psnr 352 288 420 highway_cif.yuv highway_out.yuv
 
Done!
 
References :
 
Research in field done by:
Mr. Abhishek Thakur and team,
Department of CS & IS
BITS-Pilani, Hyderabad Campus,
India. 
 

EvalVid on NS-2 on Ubuntu 12.04

In my previous posts I have given steps to install NS-2 on Ubuntu 12.04 and also given a few details on EvalVid. Now Here's how to use EvalVid on ns-2 to evaluate video over a simulated network.

Before using the tools to evaluate a video a few modifications must be done to a few files in ns-2 for integrating EvalVid into it. Use the steps provided by the following paper to do these modifications and insert the module into ns-2.

Chih-Heng Ke, Ce-Kuen Shieh, Wen-Shyang Hwang, Artur Ziviani, “An Evaluation Framework for More Realistic Simulations of MPEG Video Transmission”,
Journal of Information Science and Engineering (accepted) (SCI, EI).

Once these steps are done we need to create the network topology to do the simulation.
This network is created using a tcl script. The tcl script has the number of nodes, the link details between the nodes, etc.

1. Next we need to download a yuv sequence which has to be converted to a m4v file. This can be done using  'ffmpeg' It will create a compressed raw video. For example download akiyo_cif.yuv which is available online.
 
$ffmpeg -i akiyo_cif.yuv a01.m4v
 
2. Next we need to get the .mp4 file. Following command lines create ISO MP4 files containing the video samples (frames) and a hint track which describes how to packetize the frames for the transport with RTP.
 
$./MP4Box -hint -mtu 1024 -fps 30 -add a01.m4v a01.mp4
 
3. Now that we have the mp4 file, we need to generate the trace file required to sent into the network. The mp4trace tool from EvalVid is able to send a hinted mp4-file per RTP/UDP to a specified destination host. The output of mp4trace will be needed later, so it should be redirected to a file. The trace file thus generated is st_a01.st
 
$./mp4trace -f -s 192.168.0.2 12346 a01.mp4 > st_a01.st
 
4. For example use the be_a01.tcl from the link mentioned above to create the network and run this tcl file.
 
$./ns2 be_a01.tcl 

5. This now runs the tcl script thus sending the trace file into the network and create two files
sd_a01 and rd_a01. The file sd_a01 is to record the sending time of each packet while the file rd_a01 is used to record the received time of each packet.
 
6.  The next step is the reconstruction of the transmitted video as it is seen by the receiver. For this, the video and trace files are processed by etmp4 (Evaluate Traces of MP4-file transmission):
 
$./etmp4 -f -0 sd_a01 rd_a01 st_a01.st a01.mp4 a01out

This generates a possibly corrupt vido file in which the lost frames are deleted.

7. decode into the .yuv format and then calculate the psnr

$ffmpeg -i a01out.mp4 a01out.yuv
$./psnr 352 288 420 akiyo_cif.yuv a01out.yuv
 
Done!
References: 
http://140.116.164.80/~smallko/ns2/Evalvid_in_NS2.htm
http://140.116.164.80/~smallko/ns2/myevalvid2.htm
http://www.isi.edu/nsnam/ns/

Research in field done by:
Mr. Abhishek Thakur and team,
Department of CS & IS
BITS-Pilani, Hyderabad Campus,
India.

Installing NS-3 on Ubuntu 12.04

ns-3 is a discrete-event network simulator for Internet systems, targeted primarily for research and educational use. ns-3 is free software, licensed under the GNU GPLv2 license, and is publicly available for research, development, and use.

It has a very strong documentation and does not use .tcl script to create the network topology unlike its predecessor ns-2 which makes it simpler to deal with.

Further details on ns-3 can be found here : http://www.nsnam.org/
The detailed installation steps can be found here.

1. Open a terminal and download the following packages:

sudo apt-get install gcc g++ python              
sudo apt-get install gcc g++ python python-dev
sudo apt-get install mercurial
sudo apt-get install bzr
sudo apt-get install gdb valgrind
sudo apt-get install gsl-bin libgsl0-dev libgsl0ldbl
sudo apt-get install flex bison libfl-dev
sudo apt-get install g++-3.4 gcc-3.4
sudo apt-get install tcpdump
sudo apt-get install sqlite sqlite3 libsqlite3-dev
sudo apt-get install libxml2 libxml2-dev
sudo apt-get install libgtk2.0-0 libgtk2.0-dev
sudo apt-get install vtun lxc
sudo apt-get install uncrustify
sudo apt-get install doxygen graphviz imagemagick 
sudo apt-get install texlive texlive-extra-utils texlive-latex-extra
sudo apt-get install python-sphinx dia
sudo apt-get install python-pygraphviz python-kiwi python-pygoocanvas libgoocanvas-dev
sudo apt-get install libboost-signals-dev libboost-filesystem-dev
sudo apt-get install openmpi*
 
 
2. Next download the .tar file and untar it.

mkdir ns3
cd ns3
wget http://www.nsnam.org/release/ns-allinone-3.13.tar.bz2
tar xjf ns-allinone-3.13.tar.bz2

3. Build the ns-3and configure with waf

./build.py
./waf distclean
./waf configure --enable-examples --enable-tests
./waf build

4. Now test

./test.py

5. To run a file it must be in the /scratch folder (example ../scratch/test.cc)

./waf -run scratch/test

Done!

 

EvalVid and Usage in a network simulator

EvalVid, written by Jirka Klaue is a framework and tool-set for evaluation of the quality of video transmitted over a real or simulated communication network. It is targeted for researchers who want to evaluate their network designs or setups in terms of user perceived video quality.

Besides measuring QoS parameters of the underlying network, like loss rates, delays, and jitter, standard video quality metrics like PSNR and SSIM and a subjective video quality evaluation metric of the received video are provided.

 Currently H.264, MPEG-4 and H.263 are supported. AAC support is also included, though the perceptual quality evaluation has to be done by external tools implementing dedicated metrics like PESQ or PEAQ.

EvalVid can be downloaded for the linux platform from http://www2.tkn.tu-berlin.de/research/evalvid/

EvalVid-2.7 for linux contains the following binary files:
eg
etmp4
hist
miv
mos
mp4trace
psnr
vsgen

We also require ffmpeg and MP4Box to be downloaded separately.
mp4box
ffmpeg

the whole package can be used to evaluate media over a simulated network using NS-2 or NS-3 in the following way:

 


To run it successfully we need the ffmpeg, mp4box, mp4trace, etmp4 and psnr.

ffmpeg:  ffmpeg is a very fast video and audio converter that can also grab from a live audio/video source. It can also convert between arbitrary sample rates and resize video on the fly with a high quality polyphase filter. Used to convert the .yuv format to a .m4v format.

mp4box: MP4Box  is  a multi-purpose command line tool to create and edit MPEG-4 Systems presentations and manipulate ISO-media files (MP4, 3GP, MOV). converts .m4v to a .mp4 file.

mp4trace: mp4trace generates the trace file which is got using the details like packetmode or frame mode or  protocol type, delay, etc. This trace file is sent over the simulated network.

etmp4: This is used to generate the received video file which would be probably corrupted where all the frames that were corrupted or lost would be deleted.

psnr: This calculates the peak signal to noise ratio between the two files (sent and received). Typical values for the PSNR in lossy image and video compression are between 30 and 50 dB, where higher is better. Acceptable values for wireless transmission quality loss are considered to be about 20 dB to 25 dB

References:
http://www2.tkn.tu-berlin.de/research/evalvid/
http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio

Research in field done by:
Mr. Abhishek Thakur and team,
Department of CS & IS
BITS-Pilani, Hyderabad Campus,
India.




Saturday 19 January 2013

IInstalling NS2 on Ubuntu 12.04

Step-1: Open the terminal and install the following packages.

sudo apt-get install gcc g++ python             
sudo apt-get install gcc g++ python python-dev
sudo apt-get install mercurial
sudo apt-get install bzr
sudo apt-get install gdb valgrind
sudo apt-get install gsl-bin libgsl0-dev libgsl0ldbl
sudo apt-get install flex bison libfl-dev
sudo apt-get install g++-3.4 gcc-3.4

sudo apt-get install tcpdump
sudo apt-get install sqlite sqlite3 libsqlite3-dev
sudo apt-get install libxml2 libxml2-dev
sudo apt-get install libgtk2.0-0 libgtk2.0-dev
sudo apt-get install vtun lxc
sudo apt-get install uncrustify
sudo apt-get install doxygen graphviz imagemagick 

sudo apt-get install texlive texlive-extra-utils texlive-latex-extra
sudo apt-get install python-sphinx dia
sudo apt-get install python-pygraphviz python-kiwi python-pygoocanvas libgoocanvas-dev
sudo apt-get install libboost-signals-dev libboost-filesystem-dev
sudo apt-get install openmpi*


Step-2: Download the all-in-one tar.gz file from http://www.isi.edu/nsnam/ns/

Step-3: untar it using "tar zxvf ns-allinone-xxx.tar.gz"
Once all files are extracted enter into the directory through the terminal and run "./install". Once installed the terminal will look something like this and will ask you to add some directoties to your PATH environment.

 Step-4: PATH is an enviroment variable. Basically, it tells your machine were to search for programs.
Open a new terminal and edit the following file from your home directory using:
"gedit ~/.profile"
find the following line: PATH="$HOME/bin:$PATH"
and add a colon and paste the given directory.... something like this - PATH="$HOME/bin:$PATH://home/kartheek/Desktop/Cop/ns-allinone-2.35/bin".

Step-5:  Add two more environment variables as shown in the terminal for LD_LIBRARY and TCL_LIBRARY as follows:

export LD_LIBRARY_PATH="/home/kartheek/Desktop/Cop/ns-allinone-2.35/otcl-1.14:/home/kartheek/Desktop/Cop/ns-allinone-2.35/lib"
 

export TCL_LIBRARY_PATH="/home/kartheek/Desktop/Cop/ns-allinone-2.35/tcl8.5.10/library"

Step-6: Now NS2 has been installed and configured.... to validate it do the following:
cd ns-2.35; ./validate (may take a really long time!!!)

One last step....to start using the network simulator install the following:
sudo apt-get install ns2 nam xgraph