ISWR 
 Statistical Datasets
    
    
    
      ISWR
      is a dataset directory which
      contains example datasets used for statistical analysis.
    
    
      Licensing:
    
    
      The computer code and data files described and made available on this web page
      are distributed under
      the GNU LGPL license.
    
    
      Related Data and Programs:
    
    
      
      CENSUS,
      a dataset directory which
      contains US census data;
    
    
      
      DRAFT_LOTTERY,
      a dataset directory which
      contains the numbers assigned to each birthday, for the Selective Service System
      lotteries for 1970 through 1976.
    
    
      
      HARTIGAN,
      a dataset directory which
      contains datasets for testing clustering algorithms;
    
    
      
      MARTINEZ,
      a dataset directory which
      contains datasets for computational statistics,
      including cluster analysis;
    
    
      
      MDS,
      a dataset directory which
      contains datasets for multidimensional scaling;
    
    
      
      PCL,
      a dataset directory which
      contains datasets from a gene expression experiment on Arabidopsis,
      which are candidates for data cluster analysis;
    
    
      
      REGRESSION,
      a dataset directory which
      contains datasets for testing linear regression;
    
    
      
      SGB,
      a dataset directory which
      contains files used as input data for
      demonstrations and tests of Donald Knuth's Stanford Graph Base.
    
    
      
      SOKAL_ROHLF,
      a dataset directory which
      contains biological datasets considered by Sokal and Rohlf.
    
    
      
      SPAETH,
      a dataset directory which
      contains datasets for cluster analysis;
    
    
      
      SPAETH2,
      a dataset directory which
      contains datasets for cluster analysis;
    
    
      
      TIME_SERIES,
      a data directory of examples of time series,
      which are simply records of the values of some quantity at
      a sequence of times.
    
    
      
      TRIOLA,
      a dataset directory which
      contains datasets used for statistical analysis.
    
    
      
      WORDS,
      a dataset directory which
      contains lists of words.
    
    
      Reference:
    
    
      
        - 
          Peter Dalgaard,
          Introductory Statistics with R,
          Springer, 2008,
          ISBN13: 978-0-387-79053-4,
          LC: QA276.45.R3.D35.
    
    
      Datasets:
    
    
      The examples are available in CSV (Comma-Separated Values) format:
      
        - 
          alkfos.csv,
          repeated measurements of alkaline phosphatase in breast cancer patients.
        
- 
          ashina.csv,
          effect of an NO synthase inhibitor on headaches.
        
- 
          bcmort.csv,
          measuring the effect of screening for breast cancer.
        
- 
          bp.obese.csv,
          sex, weight, and blood pressure.
        
- 
          caesar.csv,
          caesarean section versus shoe size.
        
- 
          coking.csv,
          oven width, temperature, and time to coking.
        
- 
          cystfibr.csv,
          lung function measurements for cystic fibrosis patients.
        
- 
          eba1977.csv,
          lung cancer incidence in four Danish cities.
        
- 
          energy.csv, 
          energy expenditure measurements for groups of lean and obese women.
        
- 
          ewrates.csv, 
          rates of lung and nasal cancer mortality, and all causes.
        
- 
          fake.trypsin.csv,
          serum levels of trypsin in healthy volunteers.
        
- 
          graft.vs.host.csv,
          data from patients receiving a bone marrow transplant.
        
- 
          heart.rate.csv,
          measurements for patients before and after receiving treatment.
        
- 
          hellung.csv,
          growth of Tetrahymena cells.
        
- 
          igm.csv,
          serum IgM in 298 children, in grams per liter.
        
- 
          intake.csv,
          energy intake for 11 women.
        
- 
          juul.csv,
          insulin-like growth factor measurements for many subjects, at a sequence of ages.
        
- 
          juul2.csv,
          an extended version of the JUUL data.
        
- 
          kfm.csv,
          breast-feeding data.
        
- 
          lung.csv,
          data on three different methods of determining lung volume.
        
- 
          malaria.csv,
          results of tests on 100 children for antibody level and malaria exposure.
        
- 
          melanom.csv,
          survival of patients after an operation for malignant melanoma.
        
- 
          nickel.csv,
          health data about nickel workers.
        
- 
          nickel.expand.csv,
          an expanded version of the nickel worker data.
        
- 
          philion.csv,
          estimates of the EC50 of a biological dose-response relation.
        
- 
          react.csv,
          differences between two nurses' determinations of tuberculin reaction sizes.
        
- 
          red.cell.folate.csv,
          red cell folate levels in patients receiving three different kinds of ventilation
          during anesthesia.
        
- 
          rmr.csv,
          resting metabolic rate for 44 women.
        
- 
          secher.csv,
          ultrasound measurements of fetuses immediately before birth, and their birth weight.
        
- 
          secretin.csv,
          secretin-induced blood glucose changes.
        
- 
          stroke.csv,
          cases of stroke in Tartu, Estonia.
        
- 
          tb.dilute.csv,
          a drug test involving dilutions of tuberculin.
        
- 
          thuesen.csv,
          ventricular shortening velocity and blood glucose for type 1 diabetic
          patients.
        
- 
          tlc.csv,
          total lung capacity.
        
- 
          vitcap.csv,
          vital capacity for 24 workers in the cadmium industry.
        
- 
          vitcap2.csv,
          vital capacity for 84 workers in the cadmium industry.
        
- 
          wright.csv,
          comparison of Wright peak-flow meters.
        
- 
          zelazo.csv,
          age at walking for four groups of infants.
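
      Since the files are plain CSV, they can be read with standard tools.
      The following is a minimal Python sketch, not part of the original
      distribution. It assumes that a file begins with a header row and that
      missing values are written as NA; neither is stated on this page, so
      check the file you download. The column names "id" and "value" and the
      sample numbers are made-up placeholders, not taken from any dataset
      listed here.

```python
import csv
import io


def load_csv_rows(handle):
    """Read an open CSV file object into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle)
    return [dict(row) for row in reader]


def numeric_column(rows, name):
    """Convert one column to floats, skipping blank or 'NA' entries (assumed convention)."""
    values = []
    for row in rows:
        text = (row.get(name) or "").strip()
        if text and text.upper() != "NA":
            values.append(float(text))
    return values


# Placeholder text standing in for a downloaded file; the values are invented.
sample = io.StringIO("id,value\n1,2.5\n2,NA\n3,4.0\n")
rows = load_csv_rows(sample)
values = numeric_column(rows, "value")
print(len(rows), values)
```

      For a real file, replace the in-memory sample with something like
      open("energy.csv", newline="") and pass a column name that actually
      appears in that file's header.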
        
    
    
    
      Last revised on 29 August 2011.