brc_data, a MATLAB code which creates a file of randomly generated temperature readings to be associated with weather stations at 413 world cities. A file like this, with 1 billion records, is used for the billion row challenge (BRC).
The billion row challenge asks for a program that reads the temperature data and reports, for each weather station, the minimum, mean, and maximum of all the temperature readings given at that site. The challenge is to do this as fast as possible.
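The per-station reduction described above can be sketched on a handful of in-memory readings. This is only an illustrative sketch, not the method used by any particular BRC solution; the sample values and the variable names (station, temp, tmin, tmean, tmax) are assumptions made for the example.

```matlab
% Hypothetical sample readings; names and values are illustrative only.
station = { 'Kansas City'; 'Damascus'; 'Kansas City'; 'Darwin' };
temp = [ -0.8; 19.8; 28.0; 29.1 ];

% Map each reading to the index of its station in the sorted list of unique names.
[ name, ~, id ] = unique ( station );

% Reduce the readings at each station to a minimum, mean, and maximum.
tmin = accumarray ( id, temp, [], @min );
tmean = accumarray ( id, temp, [], @mean );
tmax = accumarray ( id, temp, [], @max );

for i = 1 : numel ( name )
  fprintf ( '%s: min = %.1f, mean = %.1f, max = %.1f\n', ...
    name{i}, tmin(i), tmean(i), tmax(i) );
end
```

This approach holds every reading in memory at once, which is convenient for small tests but is unlikely to be practical for a file with 1 billion records.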
A file created by this program consists of n rows. Each row lists a city name representing a weather station, a semicolon, and then a temperature measurement. Here is a sample of a few lines from such a file:
  Kansas City;-0.8
  Damascus;19.8
  Kansas City;28.0
  La Ceiba;17.0
  Darwin;29.1
  New York City;2.1

As you might guess, the temperature is in Centigrade. Moreover, the city names appear in random order, and may be repeated.
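A file in this format could be written along the following lines. This is a minimal sketch under stated assumptions, not the brc_data code itself: the five-city list, the output file name brc_sample.txt, and the uniform temperature range are all hypothetical choices made for the example.

```matlab
% Hypothetical short station list; the actual code uses 413 world cities.
city = { 'Kansas City', 'Damascus', 'La Ceiba', 'Darwin', 'New York City' };
n = 10;                         % number of rows to write
filename = 'brc_sample.txt';    % hypothetical output file name

fid = fopen ( filename, 'wt' );
for i = 1 : n
  k = randi ( numel ( city ) );     % pick a station at random; repeats are allowed
  t = -10.0 + 50.0 * rand ( );      % assumed range: uniform in [-10,40)
  fprintf ( fid, '%s;%.1f\n', city{k}, t );   % name, semicolon, one decimal place
end
fclose ( fid );
```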
  brc_data n

where n is the number of rows to be written to the file.
The information on this web page is distributed under the MIT license.
brc_data is available in a C version and a C++ version and a MATLAB version.
brc_naive, a MATLAB code which reads a file of randomly generated temperature readings, associated with weather stations at 413 world cities, and computes the minimum, mean, and maximum temperature for each weather station. It also reports the total execution time. Processing such a data file with one billion records is the substance of the billion row challenge (BRC).
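A straightforward reader of this kind might look like the sketch below, which processes the file one line at a time and times the work with tic and toc. It is not the brc_naive source; the file name brc_sample.txt, the three-line test file, and the running-statistics layout [ min, sum, max, count ] are assumptions made for illustration.

```matlab
filename = 'brc_sample.txt';    % hypothetical input file name

% Write a tiny hypothetical file so that the sketch is self-contained.
fid = fopen ( filename, 'wt' );
fprintf ( fid, 'Kansas City;-0.8\nDamascus;19.8\nKansas City;28.0\n' );
fclose ( fid );

tic;
% For each station name, store [ min, sum, max, count ].
stats = containers.Map ( 'KeyType', 'char', 'ValueType', 'any' );
fid = fopen ( filename, 'rt' );
while true
  text_line = fgetl ( fid );
  if ~ischar ( text_line )
    break;                                 % end of file
  end
  k = find ( text_line == ';', 1, 'last' );
  if isempty ( k )
    continue;                              % skip lines without a separator
  end
  name = text_line(1:k-1);
  t = str2double ( text_line(k+1:end) );
  if isKey ( stats, name )
    s = stats(name);
    s = [ min( s(1), t ), s(2) + t, max( s(3), t ), s(4) + 1 ];
  else
    s = [ t, t, t, 1 ];
  end
  stats(name) = s;
end
fclose ( fid );
elapsed = toc;

% Report the statistics for each station, then the execution time.
names = keys ( stats );
for i = 1 : numel ( names )
  s = stats(names{i});
  fprintf ( '%s: min = %.1f, mean = %.1f, max = %.1f\n', ...
    names{i}, s(1), s(2) / s(4), s(3) );
end
fprintf ( 'Elapsed time: %g seconds\n', elapsed );
```

Keeping only a running minimum, sum, maximum, and count per station means the memory used depends on the number of stations rather than on the number of rows.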