DAY30
Thursday, 19 July 2012


Today is a LAB day.

We have LAB #9 to be done during class time, for credit. When you have completed the lab, please let Detelina know so that she can approve and record your work.


LAB#9:

Task 1: Copy the file "major_general.txt" from the list of files at the bottom of this page. This is a song from "The Pirates of Penzance". We will need this file for the lab exercises.

Task 2: Create a main program which will read a text one character at a time. You may recall from yesterday's lecture that such a program can use the function getchar(). You probably forgot that, in order to be able to warn you about the end of input, getchar() returns an integer, not a character. If this integer equals the special value EOF, then the text has terminated and you should break from the loop that is reading. Otherwise, you can copy the integer into a character and work with it.

(You might want to compare your program to the "type.c" program I discussed on Wednesday. A copy is available at the bottom of this page.)

An outline of the heart of your program would be:

      LOOP ( forever )
        check = getchar ( )
        if check equals EOF then break
        c = check
      

We want to count the number of characters that were read by the program. So create a counter variable, initialize it to zero, increment it each time a character is successfully read, and print the value after you exit the loop.

      LOOP ( forever )
        check = getchar ( )
        if check equals EOF then break
        c = check
        increment the counter

      PRINT the counter
      

To see if your program is working, run your program and type the following string at the terminal. When you get to the end of the second line, hit return, then type CTRL-D, that is, hold down the control key and push D.

         ./a.out
         I am warning you,
         Soylent Green is people!
      
If you typed exactly what I show here, your count should be 43 characters. Repeat the experiment, but now type just the first character of each line:
         ./a.out
         I
         S
      
If you typed exactly what I show here, your count should be 4 characters. What are the two "invisible" characters?

Task 3: Now, instead of typing the input to the program, "feed" the program the file "major_general.txt". How many characters does your program find in this file?

Before beginning the next task, turn off the statement that prints the counter.

Task 4: Here's a harder question: how many characters of each kind are there in the file? That is, how many times does "a" occur, and "b" and so on? We know there are theoretically 256 possible characters, but many of them are not used in text. In fact, only the characters 32 (space) through 126 (~) are printable. Let's try to see how often these characters occur.

Create an integer array called count[] that can store 256 items. Initialize every entry of count to 0.

In your program, once you have successfully read the character c, increment the corresponding entry in count. Since the character c can also be used as a number, we want to increment the "c'th" entry of count.

      INITIALIZE the count array to 0

      LOOP ( forever )
        check = getchar ( )
        if check equals EOF then break
        c = check
        increment count[c]
      

Once your program has read the text, use a loop to print the entries of count. However, print only the entries that correspond to printable characters:

      INITIALIZE the count array to 0

      LOOP ( forever )
        check = getchar ( )
        if check equals EOF then break
        c = check
        increment count[c]

      LOOP from c = 32 to 126
        PRINT c (as int), c (as char), count[c]
      

Run your program and examine the output. Does it look reasonable to you?

Task 5: Run your program, feeding it the file "major_general.txt" and saving the output as "histo.txt". This means you have to use two redirection symbols:

        ./a.out < major_general.txt > histo.txt
      

Ask gnuplot to plot your data as a histogram, using column 1 for X, column 3 for Y, and column 2 for labels:

        gnuplot
          plot "histo.txt" using 1:3:xtic(2) with boxes
      
Show Detelina your plot when you have completed this task.


HOMEWORK #9 (must be turned in by next Thursday):

Homework 9.1: Write a program that reads text and prints a copy in which every occurrence of the uppercase or lowercase letter "s" is replaced by a dollar sign "$". This should be similar to the type.c program. Demonstrate your program by applying it to the major-general file.

Homework 9.2: Write a program which reads text and reports the number of times the word "and" occurs. The words "And" and "and," and "hand" and "andy" do NOT count. We are only looking for the letters "and" separated from their neighbors by whitespace. That's what C thinks a word is. Look at the "read_words.c" example from Day 29 (see below) to see how to read words one at a time, and at the function "equalstrings.c" from Day 28 (see below) to see one way to ask whether two strings are equal. Demonstrate your program by applying it to the major-general file.

Homework 9.3: (Graduate students only!) What is the longest line in the major-general file? Answer this question by reading one line at a time from the file. You can use the stringlength() function (see below) to measure its length, unless you want to write your own function to do this. Report the location of the longest line, and its length, for the major-general file. In case of a tie, you can report just one of the winners.

For each of your homework programs, turn in a copy of the program, and a copy of the output.


Programs we might discuss:


Last revised on 19 July 2012.