Relational Databases for Biologists

Session 1: Data Conceptualization and Database Design

Exercise: Accessing MySQL and viewing table content

Goals: Learn how to access a MySQL database and look at table structure and content. Do this by accessing a database on hebrides.wi.mit.edu, a remote Linux computer, from your laptop.

See /education/bioinfo2006/db4bio/ for the course page.

See the solutions to this exercise to check your work.

To connect to MySQL on hebrides, you have two choices:

a. You can connect directly from your laptop or desktop computer to MySQL on hebrides without logging on to hebrides. To do this, you need MySQL on your computer, which is the case for the class laptops. If you need to do this on your own computer, go to the most recent MySQL download page, select your operating system, download, and install. This will get you the MySQL database management system (DBMS), which has the 'mysql' command to connect you to another computer (or you can create a database on your own computer).

b. You can first log on to hebrides and then connect to MySQL. To do this, you need to have or install a ssh (secure shell) application. See Connecting to tak or hebrides or (if using Windows) install Cygwin as described on Using Cygwin/X for X Windows on Tak or Hebrides.

Note that you can install and use MySQL on your own computer. We have chosen to use MySQL on a Linux computer so everyone in the class can access the same database. If you ever use MySQL on your own computer, the MySQL queries would work just the same (but be perhaps somewhat slower).


Connect to MySQL on hebrides using either step 1a (recommended) or 1b, depending upon if you're doing step (a) or (b) above.

1a. Get to the Unix prompt (Mac OS X) or Unix-like prompt (Windows) on your laptop. If you're using Mac OS X, open the X11 terminal (Applications > Utilities > X11), which is also on the dock. If you're using Windows, open Cygwin, which is a "Unix emulator", meaning that it makes your Windows computer act as if it's a Unix computer.

In either case, you now have a terminal, essentially a place to send commands to your computer, including the possibility of connnecting to a remote computer.

Change directories ("cd", a Unix command) to go to the desktop:

cd "$USERPROFILE/Desktop"   (for Windows)
cd Desktop   (for Macintosh)

1b. From the terminal on your laptop, connect to Hebrides using the secure shell (ssh) command:

ssh -l username hebrides.wi.mit.edu

and you'll be prompted for your password.

Type pwd ("print working directory") to find your home directory on Hebrides. In the future you may be creating and accessing new directories. If you ever get lost, typing cd ("change directory") will bring you back to your home directory. Another very common Unix command is ls ("list") to list the files in the current directory.

2. To access MySQL on Hebrides, use this command to connect to the db4bio database:

mysql -u username -h hebrides.wi.mit.edu -D database -p
with the following options:

-u username (replace with your MySQL username [not necessarily your hebrides username])

-h host (where the "host" is hebrides.wi.mit.edu; not necessary if you're already connected to hebrides)

-D database (which takes you directly to the database we'll be using; optional)

-p password (for which you will be prompted)

ex:   mysql -u guest99 -h hebrides.wi.mit.edu -D db4bio -p

or if you're already connected to hebrides (by following step (1b) above), use this command:

mysql -u username -D database -p

3. Important notes: if you're logged in to MySQL, you should get the MySQL command prompt:

mysql>
so if you don't see this, you're not logged into the database. Keep in mind that every MySQL command must end with a semicolon. This also means that you can split up a long command into several lines. If you don't get the MySQL command prompt back after you've entered a command, like
mysql> show databases
    ->
there's a good chance you forgot the semicolon. Add it on any subsequent line and your command will be processed.

4. Find out what databases are on the system:

mysql> show databases;
5. To access one of these databases:
mysql> use database;
where "database" is replaced with the name of the one you'd like to access. In this case, enter the database "db4bio".
6. To get a list of tables in this database:
mysql> show tables;
7. To get a description of data fields and types in a table:
mysql> describe table;
where "table" is replaced with the one you'd like to access.
8. To look at some of the data in this table, use a general "SELECT" command. Since these tables are large, you'll want to limit the number of rows (tuples) in your query to 5 or so:
mysql> SELECT * from table LIMIT 5;
where "table" is replaced with the one from which you'd like to select rows.
9. Create a file containing the query and use Unix redirection to save the output of the query into another file.

a. Exit from MySQL:

mysql> quit

b. Create a directory ("dbclass") and move into it:

mkdir dbclass
cd dbclass
with which you "make a directory" (mkdir) and then "change to the directory" (cd).

c. Create a text file called query1.sql using a text editor like TextPad, BBEdit, Word Pad, or even Word. In the file, write a SQL query (ending with a semicolon). Be sure to save the file as plain text in the current directory ("dbclass" on the desktop).

d. Back in the shell, to your MySQL login command, add the SQL file (query1.sql) as input and the file query1.out as output all on one line.

mysql -u username -h hebrides.wi.mit.edu -D db4bio -p < query1.sql > query1.out
Note that you have to specify the database in the command because it's not specified by the query.

e. View the outputfile with a Unix command

more query1.out
or open the file in a text editor.


Questions or comments?   gbell@wi.mit.edu
Bioinformatics and Research Computing at Whitehead Institute