csv_io, R programs which illustrate how one can read or write comma separated value (CSV) files.
Briefly, R provides the command read.csv() specifically for reading CSV files:
name <- read.csv ( "filename.csv", header = TRUE )
where header is TRUE if the first row of the file contains column labels, and FALSE otherwise. Alternatively, the more general read.table() function can be used:
name <- read.table ( "filename.csv", header = TRUE, sep = "," )
where sep is "," for commas, " " for spaces, or "\t" for tabs.
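As a minimal sketch of the two calls above, the following writes a small file to a temporary location and reads it back both ways. The file contents and the variable names a and b are made up for illustration; they are not part of the csv_io data files.

```r
# Create a small CSV file in a temporary location (illustrative data only).
filename <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2", "3,4"), filename)

# read.csv() uses sep = "," and header = TRUE by default.
a <- read.csv(filename, header = TRUE)

# read.table() is more general: header defaults to FALSE and sep to
# white space, so both are given explicitly for a CSV file.
b <- read.table(filename, header = TRUE, sep = ",")

print(a)
print(all.equal(a, b))
```

For a simple numeric file like this one, both calls should produce the same data frame; read.csv() is essentially read.table() with defaults chosen for CSV files.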
The simplest example of a comma separated value file involves numeric data, with the same number of values occurring on each row. Thus, we might have three rows of data, each containing 10 values, with successive values separated by commas (which R specifies by sep=","):
1,2,3,4,5,6,7,8,9,10
11,12,13,14,15,16,17,18,19,20
21,22,23,24,25,26,27,28,29,30
Variations on this approach use the TAB character (which R specifies by sep="\t") to separate the values:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
or spaces (which R specifies by sep=" " for a single space between values; the read.table() default, sep="", accepts any run of white space):
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
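The three separator variations can be sketched as follows. This is an illustrative round trip under assumed names (numbers, seps, files, results) and temporary file names, not the code of the NUMBERS example itself.

```r
# The 3 x 10 table of values shown above.
numbers <- matrix(1:30, nrow = 3, ncol = 10, byrow = TRUE)

# One temporary file per separator (placeholder file names).
seps <- c(comma = ",", tab = "\t", space = " ")
files <- tempfile(pattern = names(seps))
names(files) <- names(seps)

# Write the same values three times, changing only the separator.
for (k in names(seps)) {
  write.table(numbers, files[[k]], sep = seps[[k]],
              row.names = FALSE, col.names = FALSE)
}

# Read each file back, naming the separator that was used to write it.
results <- lapply(names(seps), function(k) {
  read.table(files[[k]], header = FALSE, sep = seps[[k]])
})

# With header = FALSE, R supplies default column names V1, V2, ...
print(results[[1]])
```

Each element of results should hold the same 3 by 10 table; only the sep argument differs between the three reads.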
Sometimes the first row of a CSV file is used to give labels to the columns. In this case, the command to read the file should specify header=TRUE:
"Age", "Weight in Pounds"
0, 8.4
1, 16.8
2, 24.5
3, 37.0
4, 49.0
5, 63.0
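A hedged sketch of reading such a file with header=TRUE is given below. For simplicity the contents are supplied inline through the text argument rather than a file, and the blanks after the commas are dropped; the variable names lines and growth are illustrative.

```r
# The file contents from the example above, supplied inline via text=
# instead of a file name (blanks after the commas omitted).
lines <- c('"Age","Weight in Pounds"',
           "0,8.4", "1,16.8", "2,24.5", "3,37.0", "4,49.0", "5,63.0")

# header = TRUE: the first row supplies the column labels.
# check.names = FALSE keeps "Weight in Pounds" as written; by default R
# would rewrite it as the syntactic name Weight.in.Pounds.
growth <- read.csv(text = lines, header = TRUE, check.names = FALSE)

print(names(growth))
print(growth[["Weight in Pounds"]])
```

Because the label contains spaces, the column is accessed with growth[["Weight in Pounds"]] rather than the $ shorthand without backquotes.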
A data frame is a table of data that may include row or column labels as well. For instance, the FAMILY example defines a data frame of 4 properties of 7 individuals, which can be represented as the following table, where items in bold face are labels:
| | First name | Age | Height (inches) | Weight (pounds) |
|---|---|---|---|---|
| **Dad** | Horace | 45 | 71 | 230 |
| **Mom** | Gladys | 43 | 66 | 158 |
| **Big Bro** | Marco | 22 | 73 | 190 |
| **Big Sis** | Desiree | 20 | 69 | 150 |
| **Me** | Arnold | 17 | 69 | 165 |
| **Punk Sis** | Darleen | 15 | 67 | 120 |
| **Dog** | Blotch | 3 | 12 | 15 |
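One way to build this table as a data frame, write it to a CSV file with its labels, and read it back is sketched below. This is an assumption about how such a round trip might look, not the code of the FAMILY example; the variable names and the temporary file name are placeholders.

```r
# Build the data frame from the table above.
family <- data.frame(
  first_name = c("Horace", "Gladys", "Marco", "Desiree",
                 "Arnold", "Darleen", "Blotch"),
  age    = c(45L, 43L, 22L, 20L, 17L, 15L, 3L),
  height = c(71L, 66L, 73L, 69L, 69L, 67L, 12L),
  weight = c(230L, 158L, 190L, 150L, 165L, 120L, 15L)
)

# Replace R's default labels with user-chosen ones.
rownames(family) <- c("Dad", "Mom", "Big Bro", "Big Sis",
                      "Me", "Punk Sis", "Dog")
colnames(family) <- c("First name", "Age", "Height (inches)",
                      "Weight (pounds)")

# Write values and labels to a temporary CSV file (placeholder name).
filename <- tempfile(fileext = ".csv")
write.csv(family, filename, row.names = TRUE)

# Read back: column 1 holds the row labels; check.names = FALSE
# preserves column labels containing spaces and parentheses.
family2 <- read.csv(filename, header = TRUE, row.names = 1,
                    check.names = FALSE)
print(family2)
```

Passing row.names = 1 tells read.csv() to treat the first column of the file as row labels rather than as data.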
The computer code and data files on this web page are distributed under the MIT license.
csv_io is available in a FORTRAN90 version and an R version.
XLS_IO, R programs which illustrate how data can be shared between Microsoft Excel and R, using XLS and CSV files.
PLAYERS is an example involving the highest salaried football players. The data includes first and last name, team, position, and salary, for 5 players. The file includes an initial row of column labels. The data is separated by commas.
NUMBERS is an example in which no column or row labels are used, and all the data is numeric, consisting of 3 rows of 10 numbers. The basic file uses commas to separate the data, but two related files use tabs or spaces instead.
FAMILY is an example in which column and row labels are used. In fact, the default labels used by R are replaced by user-chosen values. The values and labels are written to the CSV file, and read back.