ISWR
Statistical Datasets

ISWR is a dataset directory which contains example datasets used for statistical analysis, drawn from Peter Dalgaard's book Introductory Statistics with R (see the Reference below).

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU Lesser General Public License (LGPL).

Related Data and Programs:

CENSUS, a dataset directory which contains US census data;

DRAFT_LOTTERY, a dataset directory which contains the numbers assigned to each birthday in the Selective Service System lotteries for 1970 through 1976;

HARTIGAN, a dataset directory which contains datasets for testing clustering algorithms;

MARTINEZ, a dataset directory which contains datasets for computational statistics, including cluster analysis;

MDS, a dataset directory which contains datasets for multidimensional scaling;

PCL, a dataset directory which contains datasets from a gene expression experiment on Arabidopsis, which are candidates for data cluster analysis;

REGRESSION, a dataset directory which contains datasets for testing linear regression;

SGB, a dataset directory which contains files used as input data for demonstrations and tests of Donald Knuth's Stanford Graph Base;

SOKAL_ROHLF, a dataset directory which contains biological datasets considered by Sokal and Rohlf;

SPAETH, a dataset directory which contains datasets for cluster analysis;

SPAETH2, a dataset directory which contains datasets for cluster analysis;

TIME_SERIES, a dataset directory which contains examples of time series, that is, records of the values of some quantity at a sequence of times;

TRIOLA, a dataset directory which contains datasets used for statistical analysis;

WORDS, a dataset directory which contains lists of words.

Reference:

Peter Dalgaard,
Introductory Statistics with R,
Springer, 2008,
ISBN13: 978-0-387-79053-4,
LC: QA276.45.R3.D35.

Datasets:

The examples are available in CSV (Comma Separated Value) format:

alkfos.csv, repeated measurements of alkaline phosphatase in breast cancer patients.
ashina.csv, effect of an NO synthase inhibitor on headaches.
bcmort.csv, measuring the effect of screening for breast cancer.
bp.obese.csv, sex, weight, and blood pressure.
caesar.csv, caesarean section versus shoe size.
coking.csv, oven width, temperature, and time to coking.
cystfibr.csv, lung function measurements for cystic fibrosis patients.
eba1977.csv, lung cancer incidence in four Danish cities.
energy.csv, energy expenditure measurements for groups of lean and obese women.
ewrates.csv, rates of lung and nasal cancer mortality, and all causes.
fake.trypsin.csv, serum levels of trypsin in healthy volunteers.
graft.vs.host.csv, data from patients receiving a bone marrow transplant.
heart.rate.csv, measurements for patients before and after receiving treatment.
hellung.csv, growth of Tetrahymena cells.
igm.csv, serum IgM in 298 children, in grams per liter.
intake.csv, energy intake for 11 women.
juul.csv, insulin-like growth factor measurements for many subjects, at a sequence of ages.
juul2.csv, an extended version of the JUUL data.
kfm.csv, breast-feeding data.
lung.csv, data on three different methods of determining lung volume.
malaria.csv, results of tests on 100 children for antibody level and malaria exposure.
melanom.csv, survival of patients after an operation for malignant melanoma.
nickel.csv, health data about nickel workers.
nickel.expand.csv, an expanded version of the nickel worker data.
philion.csv, estimates of the EC50 of a biological dose-response relation.
react.csv, differences between two nurses' determinations of tuberculin reaction sizes.
red.cell.folate.csv, red cell folate levels in patients receiving three different kinds of ventilation during anesthesia.
rmr.csv, resting metabolic rate for 44 women.
secher.csv, ultrasound measurements of fetuses immediately before birth, and their birth weight.
secretin.csv, secretin-induced blood glucose changes.
stroke.csv, cases of stroke in Tartu, Estonia.
tb.dilute.csv, a drug test involving dilutions of tuberculin.
thuesen.csv, ventricular shortening velocity and blood glucose for type 1 diabetic patients.
tlc.csv, total lung capacity.
vitcap.csv, vital capacity for 24 workers in the cadmium industry.
vitcap2.csv, vital capacity for 84 workers in the cadmium industry.
wright.csv, comparison of Wright peak-flow meters.
zelazo.csv, age at walking for four groups of infants.
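
Since the files above are distributed as CSV, a short sketch of loading one may be useful. The Python example below is only an illustration: it assumes each file starts with a header row of column names and that missing values appear as non-numeric text such as "NA", neither of which is stated on this page. The helper names read_numeric_csv and column_mean, and the sample text with its "group" and "value" columns, are made up for the example and are not taken from any of the listed files.

```python
import csv
import io
from statistics import mean


def read_numeric_csv(handle):
    """Read a CSV stream with a header row into a dict of column -> list.

    Cells that parse as floats are stored as floats; anything else
    (for example an empty cell or the text "NA") is stored as None.
    """
    reader = csv.DictReader(handle)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            cell = (row.get(name) or "").strip()
            try:
                columns[name].append(float(cell))
            except ValueError:
                columns[name].append(None)
    return columns


def column_mean(columns, name):
    """Mean of the non-missing values in one column."""
    values = [v for v in columns[name] if v is not None]
    return mean(values)


# Made-up sample text, NOT the contents of any file listed above.
sample = io.StringIO("group,value\n1,2.0\n1,4.0\n2,NA\n")
table = read_numeric_csv(sample)
print(column_mean(table, "value"))
```

To try this on a real file, one would presumably download it first and pass an open file handle instead of the in-memory sample, for instance open("thuesen.csv", newline=""), then inspect the returned dictionary's keys to see which columns the file actually contains.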

Last revised on 29 August 2011.
