boston_housing, a dataset which stores training and test data about housing prices in Boston. This dataset is also available as a built-in dataset in keras.

The dataset is described as "Housing Values in Suburbs of Boston". The fields are:

  1. crim, per capita crime rate by town.
  2. zn, proportion of residential land zoned for lots over 25,000 sq.ft.
  3. indus, proportion of non-retail business acres per town.
  4. chas, Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
  5. nox, nitrogen oxides concentration (parts per 10 million).
  6. rm, average number of rooms per dwelling.
  7. age, proportion of owner-occupied units built prior to 1940.
  8. dis, weighted mean of distances to five Boston employment centres.
  9. rad, index of accessibility to radial highways.
  10. tax, full-value property-tax rate per $10,000.
  11. ptratio, pupil-teacher ratio by town.
  12. black, 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town.
  13. lstat, lower status of the population (percent).
  14. medv, median value of owner-occupied homes in $1000s.
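The field order above can be written down as a list of column names, and the formula for the black field evaluated directly. This is only a minimal sketch: the names FIELDS, FEATURES, TARGET and black_index are hypothetical helpers introduced here, not part of the dataset or of keras.

```python
# Field names in the order listed above; medv is the value to be predicted.
FIELDS = [
    "crim", "zn", "indus", "chas", "nox", "rm", "age",
    "dis", "rad", "tax", "ptratio", "black", "lstat", "medv",
]
FEATURES = FIELDS[:-1]  # the 13 predictor fields
TARGET = FIELDS[-1]     # "medv"


def black_index(bk):
    """Evaluate 1000 * (bk - 0.63)^2 for a proportion bk between 0 and 1."""
    return 1000.0 * (bk - 0.63) ** 2
```

Under this formula the field is 0 when Bk = 0.63 and rises to about 396.9 when Bk = 0.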


The computer code and data files made available on this web page are distributed under the GNU LGPL license.

Related Programs:

boston_housing, a keras script which sets up a neural network to apply regression to predict housing prices, based on the Boston housing dataset.

boston_housing_external, a keras script which reads the Boston housing dataset from an external file, rather than referencing the built-in keras dataset.


Source Code:

The CSV data files include a header line. The test CSV file does NOT include the target value, medv.
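Because the training file carries the target column and the test file does not, a reader can check the header line before splitting each row. The sketch below assumes the header names match the field names listed earlier (in particular that the target column is called medv); the function name read_housing_csv is hypothetical, and the in-memory sample uses made-up values with only three columns for brevity.

```python
import csv
import io


def read_housing_csv(handle, target="medv"):
    """Return (features, targets); targets is None if the target column is absent."""
    reader = csv.DictReader(handle)  # the first line is taken as the header
    has_target = target in (reader.fieldnames or [])
    features, targets = [], []
    for row in reader:
        if has_target:
            targets.append(float(row.pop(target)))
        features.append({name: float(value) for name, value in row.items()})
    return features, (targets if has_target else None)


# Made-up sample text standing in for the real files (assumption: comma-separated).
train_text = "crim,rm,medv\n0.1,6.0,20.0\n0.2,5.5,18.5\n"
test_text = "crim,rm\n0.3,6.2\n"

train_x, train_y = read_housing_csv(io.StringIO(train_text))
test_x, test_y = read_housing_csv(io.StringIO(test_text))
```

With a real file, the same function can be called on an open file handle, for example one opened with newline="" as the csv module recommends.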

The TXT data file has no header line. It includes an additional initial index field on each line.
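Since the TXT file has no header and begins each line with an index, a reader has to supply the field count itself and discard the first token. The sketch below assumes the fields are separated by whitespace and that 14 data fields follow the index; neither is stated above, so both are left adjustable. The name parse_txt_line is hypothetical, and the sample line is made up.

```python
def parse_txt_line(line, n_fields=14):
    """Split one record, returning (index_token, values) with the index removed."""
    tokens = line.split()  # assumption: whitespace-separated fields
    if len(tokens) != n_fields + 1:
        raise ValueError(
            f"expected {n_fields + 1} tokens including the index, got {len(tokens)}"
        )
    index_token = tokens[0]  # kept as text; its exact format is not documented here
    values = [float(token) for token in tokens[1:]]
    return index_token, values


# Made-up three-field record, for illustration only.
index_token, values = parse_txt_line("1 0.1 6.0 20.0", n_fields=3)
```

Raising an error on an unexpected token count is a deliberate choice here, so that a wrong guess about the delimiter or field count shows up immediately rather than as shifted columns.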

Last revised on 24 April 2019.