1brc_data, a C code which creates a file of randomly generated temperature readings to be associated with weather stations at 413 world cities. A file like this, with 1 billion records, is used for the one billion row challenge.
The one billion row challenge asks for a program that reads the temperature data and reports, for each weather station, the minimum, mean, and maximum of all the temperature readings given at that site. The challenge is to do this as fast as possible.
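As a rough illustration of the task, the following C sketch accumulates the minimum, mean, and maximum per station. It is not a competitive solution and is not part of 1brc_data. The names station_stats, add_record, and summarize, and the limits MAX_STATIONS and MAX_NAME, are assumptions made for this example. Stations are looked up by linear search, which keeps the sketch short but would be far too slow for a billion rows; a serious entry would likely use a hash table.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STATIONS 1024 /* assumed upper bound; the data uses 413 cities */
#define MAX_NAME 128      /* assumed maximum length of a station name */

typedef struct {
  char name[MAX_NAME];
  double min, max, sum;
  long count;
} station_stats;

/* Return the entry for name, creating it if needed; NULL if the table is full. */
static station_stats *find_or_add(station_stats *table, int *n, const char *name)
{
  for (int i = 0; i < *n; i++) {
    if (strcmp(table[i].name, name) == 0) return &table[i];
  }
  if (*n >= MAX_STATIONS) return NULL;
  station_stats *s = &table[(*n)++];
  strncpy(s->name, name, MAX_NAME - 1);
  s->name[MAX_NAME - 1] = '\0';
  s->min = 1.0e30;
  s->max = -1.0e30;
  s->sum = 0.0;
  s->count = 0;
  return s;
}

/* Fold one "name;temperature" line into the table.
   Returns 1 on success, 0 if the line is malformed or the table is full. */
int add_record(station_stats *table, int *n, const char *line)
{
  const char *sep = strchr(line, ';');
  if (sep == NULL) return 0;
  size_t len = (size_t)(sep - line);
  if (len == 0 || len >= MAX_NAME) return 0;
  char name[MAX_NAME];
  memcpy(name, line, len);
  name[len] = '\0';
  char *end;
  double t = strtod(sep + 1, &end);
  if (end == sep + 1) return 0; /* no number after the semicolon */
  station_stats *s = find_or_add(table, n, name);
  if (s == NULL) return 0;
  if (t < s->min) s->min = t;
  if (s->max < t) s->max = t;
  s->sum += t;
  s->count++;
  return 1;
}

/* Read every line from in and print min, mean, max for each station to out.
   Returns the number of distinct stations seen. */
int summarize(FILE *in, FILE *out)
{
  static station_stats table[MAX_STATIONS];
  int n = 0;
  char line[256];
  while (fgets(line, sizeof line, in)) {
    add_record(table, &n, line);
  }
  for (int i = 0; i < n; i++) {
    fprintf(out, "%s: min=%.1f mean=%.1f max=%.1f\n", table[i].name,
            table[i].min, table[i].sum / table[i].count, table[i].max);
  }
  return n;
}
```

A caller could open the data file with fopen and pass it to summarize along with stdout.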
A file created by this program consists of n rows. Each row lists a city name representing a weather station, a semicolon, and then a temperature measurement. Here is a sample of a few lines from such a file:
  Kansas City;-0.8
  Damascus;19.8
  Kansas City;28.0
  La Ceiba;17.0
  Darwin;29.1
  New York City;2.1

As you might guess, the temperature is in Centigrade. Moreover, the city names appear in random order, and may be repeated.
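The row format described above could be produced by a loop like the one in the following sketch. This is not the code of 1brc_data itself. The function names random_row and write_rows are hypothetical, the city list is just the handful of names from the sample lines rather than the full set of 413, and the temperature range of -30.0 to 50.0 with a uniform distribution is an assumption chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed short city list taken from the sample rows; the real data uses 413 cities. */
static const char *cities[] = {
  "Kansas City", "Damascus", "La Ceiba", "Darwin", "New York City"
};
#define CITY_COUNT (sizeof cities / sizeof cities[0])

/* Format one random "city;temperature\n" row into buf.
   Returns the row length, as reported by snprintf. */
int random_row(char *buf, size_t size)
{
  const char *city = cities[rand() % CITY_COUNT];
  /* Assumed range: a whole number of tenths of a degree in [-30.0, 50.0]. */
  int tenths = rand() % 801 - 300;
  return snprintf(buf, size, "%s;%.1f\n", city, tenths / 10.0);
}

/* Write n random rows to out; returns the number of rows actually written. */
long write_rows(FILE *out, long n)
{
  char row[64];
  long written = 0;
  for (long i = 0; i < n; i++) {
    int len = random_row(row, sizeof row);
    if (len < 0 || (size_t)len >= sizeof row) break; /* formatting failed or truncated */
    if (fputs(row, out) == EOF) break;               /* write error */
    written++;
  }
  return written;
}
```

Calling srand with a fixed seed before write_rows makes the output reproducible on a given system.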
Usage:

  1brc_data n filename

where n is the number of rows to write and filename is the name of the file to create.
The information on this web page is distributed under the MIT license.
1brc_data is available in a C version.