"FinalPrices" data... The source CD (LH6113HJ3102 3476 DO Label COMIT Prices) bore the following files: D:\Dispatch Prices\1998 Dispatch Prices - All nodes.zip D:\Dispatch Prices\1999 Dispatch Prices - All nodes.zip D:\Dispatch Prices\2000 Dispatch Prices - All nodes.zip D:\Dispatch Prices\2001 Dispatch Prices - All nodes.zip D:\Dispatch Prices\2002 Dispatch Prices - All nodes.zip D:\Dispatch Prices\2003 Dispatch Prices - All nodes.zip D:\Dispatch Prices\2004 Dispatch Prices - All nodes 01Jan-30 Apr.zip D:\Final Prices - All Nodes\1996 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\1997 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\1998 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\1999 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\2000 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\2001 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\2002 Final Prices - All nodes.zip D:\Final Prices - All Nodes\2003 Final Prices - All Nodes.zip D:\Final Prices - All Nodes\2004 Final Prices - All nodes - Jan-Apr.zip D:\Final Reserve Prices\1996 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\1997 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\1998 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\1999 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\2000 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\2001 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\2002 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\2003 Final Reserve Prices - All Nodes.zip D:\Final Reserve Prices\2004 Final Reserve Prices - All Nodes Jan-Apr.zip D:\Forecast Prices\Forecast Prices 2000 - All nodes 03Jul-31Dec.zip D:\Forecast Prices\Forecast Prices 2001 - All nodes.zip D:\Forecast Prices\Forecast Prices 2002 - All nodes.zip D:\Forecast Prices\Forecast Prices 2003 - All nodes.zip D:\Forecast Prices\Forecast Prices 2004 - All nodes 01Jan-30Apr.zip D:\Forecast 
Prices\Reserve Forecast Prices 2000 - All nodes 03Jul-31Dec.zip D:\Forecast Prices\Reserve Forecast Prices 2001 - All nodes.zip D:\Forecast Prices\Reserve Forecast Prices 2002 - All nodes.zip D:\Forecast Prices\Reserve Forecast Prices 2003 - All nodes.zip D:\Forecast Prices\Reserve Forecast Prices 2004 - All nodes 01Jan-30Apr.zip The initial plan is to deal with the second collection, the D:\Final Prices - All Nodes\ group, which generated: C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Apr_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Aug_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Dec_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Feb_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jan_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jul_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jun_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Mar_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_May_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Nov_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Oct_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Apr2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Aug2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Dec2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Feb2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jan2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jul2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jun2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Mar2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_May2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Nov2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Oct2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Sep2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Sep_2001.csv 
C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200201.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200202.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200203.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200204.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200205.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200206.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200207.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200208.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200209.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200210.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200211.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200212.csv.Z C:\Nicky\HH\FPCD\Final Prices - All Nodes\1996 Final Prices\Final_1996_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1996 Final Prices\Final_1996_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1996 Final Prices\Final_1996_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1997 Final Prices\Final_1997_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final 
Prices\Final_1998_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1998 Final Prices\Final_1998_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\1999 Final Prices\Final_1999_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_03.csv C:\Nicky\HH\FPCD\Final 
Prices - All Nodes\2000 Final Prices\Final_2000_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\2000 Final Prices\Final_2000_12.csv On dealing with the second-order file compression, and moving the content of the sub-directories into the main, the horde became: C:\Nicky\HH\FPCD\Final Prices - All Nodes\2004 Final Prices - All nodes - Apr.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final prices Feb 2004.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final prices Jan 2004.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final prices Mar 2004.TXT C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1996_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1996_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1996_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1997_12.csv 
C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1998_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_09.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_1999_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_01.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_02.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_03.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_04.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_05.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_06.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_07.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_08.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_09.csv 
C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_10.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_11.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_2000_12.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Apr_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Aug_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Dec_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Feb_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jan_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jul_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Jun_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Mar_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_May_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Nov_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Oct_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Apr2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Aug2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Dec2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Feb2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jan2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jul2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Jun2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Mar2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_May2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Nov2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Oct2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Prices_Sep2003.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\Final_Sep_2001.csv C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200201.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200202.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200203.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200204.csv.txt C:\Nicky\HH\FPCD\Final Prices - All 
Nodes\fp200205.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200206.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200207.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200208.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200209.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200210.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200211.csv.txt C:\Nicky\HH\FPCD\Final Prices - All Nodes\fp200212.csv.txt

The first annoyance was easily dealt with. Some file names contained spaces, and as this causes trouble with the scanning of commands, the file names were changed slightly rather than escalate to the annoyances of quoting such file names, especially when such quotes are not supplied by the likes of the "DIR" command. This re-naming also caused the files to appear closer to their associates in a sorted list of names.

    Old                                      New
    2004 Final Prices - All nodes - Apr.csv  2004_Final_Prices-AllNodes-Apr.csv
    Final prices Feb 2004.csv                Final_prices_Feb2004.txt
    Final prices Jan 2004.txt                Final_prices_Jan2004.txt
    Final prices Mar 2004.TXT                Final_prices_Mar2004.txt

Thus the Feb, Jan, and Mar 2004 files joined the other files with month names. This of course means that they're not in chronological order by month, and of course, with the year last, they're not in chronological order by year either, by contrast to other files such as Final_2000_06 et al. The obvious is only obvious to some, obviously.

The first files inspected contained data such as

    ABY0111,01/10/1996,1,55.14,,F
    ABY0111,01/10/1996,2,43.95,,F
    ABY0111,01/10/1996,3,42.71,,F

So processing proceeded on that basis. The suffix of ",,F" was unexplained: it was declared a "tail" and any twitches were monitored. Supposedly the "F" signified the "finality" of the "final price".

Then suddenly, as an attempt was made on the whole collection, file Final_Apr_2001.csv was found to present fresh ingenuity.
Observe the start:

    ABY0111,1/04/2001,1,F,59.34,Y,2/04/2001 15:10:31
    ABY0111,1/04/2001,2,F,59.2,Y,2/04/2001 15:10:31
    ABY0111,1/04/2001,3,F,58.81,Y,2/04/2001 15:10:31

The fourth field, after the HH number in the third field, is no longer the HH value but a code, followed by a floating-point number, then a different style of tail-end bumph. So... if the fourth field does not contain a number, skip it and proceed as before with the fifth field. And in passing, note the date format miscegenation: the month number has a leading zero, but the day number does not. In this file. The fifth field holds the value in files for 2001, 2003 and 2004 (January - March) but the other files (e.g. for 2002) have it in the fourth field.

The next fun development was to discover the "improper dates" part-way through certain files, such as Final_Prices_Apr2003.csv, whose dates were of the form 4/1/2003 then 4/2/2003, etc. Clearly, the order for these files was Month/Day/Year, rather than Day/Month/Year as used elsewhere. At first, it seemed possible to deduce the order by relying on the first date in the file being for the first day of the month, but this won't work for January since a day/month of "1/1" is no different from a month/day of "1/1". Observing the transition from the first date to the second might work, but enough: a command "SwapDM" was added to Gnash that swaps the day and month parts of any dates it spots in a line, and so revised files were produced, with a name change while I was at it as well:

       Input                               Output
SwapDM Final_Prices_Jan2003.csv            fp2003m01.csv
SwapDM Final_Prices_Feb2003.csv            fp2003m02.csv
SwapDM Final_Prices_Mar2003.csv            fp2003m03.csv
SwapDM Final_Prices_Apr2003.csv            fp2003m04.csv
SwapDM Final_Prices_May2003.csv            fp2003m05.csv
SwapDM Final_Prices_Jun2003.csv            fp2003m06.csv
SwapDM Final_Prices_Jul2003.csv            fp2003m07.csv
SwapDM Final_Prices_Aug2003.csv            fp2003m08.csv
SwapDM Final_Prices_Sep2003.csv            fp2003m09.csv
SwapDM Final_Prices_Oct2003.csv            fp2003m10.csv
SwapDM Final_Prices_Nov2003.csv            fp2003m11.csv
SwapDM Final_Prices_Dec2003.csv            fp2003m12.csv

Similarly, the other files with the standard dates were also renamed to produce a consistent sequence, as follows:

Rename 2004_Final_Prices-AllNodes-Apr.csv  fp2004m04.csv
Rename Final_1996_10.csv                   fp1996m10.csv
Rename Final_1996_11.csv                   fp1996m11.csv
Rename Final_1996_12.csv                   fp1996m12.csv
Rename Final_1997_01.csv                   fp1997m01.csv
Rename Final_1997_02.csv                   fp1997m02.csv
Rename Final_1997_03.csv                   fp1997m03.csv
Rename Final_1997_04.csv                   fp1997m04.csv
Rename Final_1997_05.csv                   fp1997m05.csv
Rename Final_1997_06.csv                   fp1997m06.csv
Rename Final_1997_07.csv                   fp1997m07.csv
Rename Final_1997_08.csv                   fp1997m08.csv
Rename Final_1997_09.csv                   fp1997m09.csv
Rename Final_1997_10.csv                   fp1997m10.csv
Rename Final_1997_11.csv                   fp1997m11.csv
Rename Final_1997_12.csv                   fp1997m12.csv
Rename Final_1998_01.csv                   fp1998m01.csv
Rename Final_1998_02.csv                   fp1998m02.csv
Rename Final_1998_03.csv                   fp1998m03.csv
Rename Final_1998_04.csv                   fp1998m04.csv
Rename Final_1998_05.csv                   fp1998m05.csv
Rename Final_1998_06.csv                   fp1998m06.csv
Rename Final_1998_07.csv                   fp1998m07.csv
Rename Final_1998_08.csv                   fp1998m08.csv
Rename Final_1998_09.csv                   fp1998m09.csv
Rename Final_1998_10.csv                   fp1998m10.csv
Rename Final_1998_11.csv                   fp1998m11.csv
Rename Final_1998_12.csv                   fp1998m12.csv
Rename Final_1999_01.csv                   fp1999m01.csv
Rename Final_1999_02.csv                   fp1999m02.csv
Rename Final_1999_03.csv                   fp1999m03.csv
Rename Final_1999_04.csv                   fp1999m04.csv
Rename Final_1999_05.csv                   fp1999m05.csv
Rename Final_1999_06.csv
                                           fp1999m06.csv
Rename Final_1999_07.csv                   fp1999m07.csv
Rename Final_1999_08.csv                   fp1999m08.csv
Rename Final_1999_09.csv                   fp1999m09.csv
Rename Final_1999_10.csv                   fp1999m10.csv
Rename Final_1999_11.csv                   fp1999m11.csv
Rename Final_1999_12.csv                   fp1999m12.csv
Rename Final_2000_01.csv                   fp2000m01.csv
Rename Final_2000_02.csv                   fp2000m02.csv
Rename Final_2000_03.csv                   fp2000m03.csv
Rename Final_2000_04.csv                   fp2000m04.csv
Rename Final_2000_05.csv                   fp2000m05.csv
Rename Final_2000_06.csv                   fp2000m06.csv
Rename Final_2000_07.csv                   fp2000m07.csv
Rename Final_2000_08.csv                   fp2000m08.csv
Rename Final_2000_09.csv                   fp2000m09.csv
Rename Final_2000_10.csv                   fp2000m10.csv
Rename Final_2000_11.csv                   fp2000m11.csv
Rename Final_2000_12.csv                   fp2000m12.csv
Rename Final_Apr_2001.csv                  fp2001m04.csv
Rename Final_Aug_2001.csv                  fp2001m08.csv
Rename Final_Dec_2001.csv                  fp2001m12.csv
Rename Final_Feb_2001.csv                  fp2001m02.csv
Rename Final_Jan_2001.csv                  fp2001m01.csv
Rename Final_Jul_2001.csv                  fp2001m07.csv
Rename Final_Jun_2001.csv                  fp2001m06.csv
Rename Final_Mar_2001.csv                  fp2001m03.csv
Rename Final_May_2001.csv                  fp2001m05.csv
Rename Final_Nov_2001.csv                  fp2001m11.csv
Rename Final_Oct_2001.csv                  fp2001m10.csv
Rename Final_prices_Feb2004.txt            fp2004m02.csv
Rename Final_prices_Jan2004.txt            fp2004m01.csv
Rename Final_prices_Mar2004.txt            fp2004m03.csv
Rename Final_Sep_2001.csv                  fp2001m09.csv
Rename fp200201.csv                        fp2002m01.csv
Rename fp200202.csv                        fp2002m02.csv
Rename fp200203.csv                        fp2002m03.csv
Rename fp200204.csv                        fp2002m04.csv
Rename fp200205.csv                        fp2002m05.csv
Rename fp200206.csv                        fp2002m06.csv
Rename fp200207.csv                        fp2002m07.csv
Rename fp200208.csv                        fp2002m08.csv
Rename fp200209.csv                        fp2002m09.csv
Rename fp200210.csv                        fp2002m10.csv
Rename fp200211.csv                        fp2002m11.csv
Rename fp200212.csv                        fp2002m12.csv

These names are boring, but adjacency is upheld. Further, they now appear in chronological order. However, the data within the files is not itself chronological.
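The two housekeeping steps in the table above (swapping day and month in every date of a line, and mapping the assorted supplied names onto the fpYYYYmMM.csv pattern) can be sketched in a few lines. This is a Python stand-in written for illustration, not Gnash's own code: the names swap_dm and canonical_name are hypothetical, and the sample line in the usage note is made up in the style of the 2003 files.

```python
import re

MONTHS = "jan feb mar apr may jun jul aug sep oct nov dec".split()

# A date as it appears in these files: one- or two-digit parts, four-digit year.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

# A month abbreviation set off by "_" or "-", so letters inside a word are ignored.
MONTH_NAME = re.compile(r"(?:^|[_-])(" + "|".join(MONTHS) + r")(?=[_\-\d]|$)")


def swap_dm(line: str) -> str:
    """Exchange the first two parts of every date found in the line."""
    return DATE.sub(lambda m: f"{m.group(2)}/{m.group(1)}/{m.group(3)}", line)


def canonical_name(old: str) -> str:
    """Map a supplied file name to fpYYYYmMM.csv (hypothetical helper)."""
    stem = old.rsplit(".", 1)[0].lower()
    year = re.search(r"(?:19|20)\d\d", stem)
    if year is None:
        raise ValueError(f"no year in {old!r}")
    named = MONTH_NAME.search(stem)
    if named:
        month = MONTHS.index(named.group(1)) + 1
    else:
        # Numeric month: the first digits after the year,
        # as in Final_1996_10 or fp200201.
        digits = re.search(r"\d{1,2}", stem[year.end():])
        if digits is None:
            raise ValueError(f"no month in {old!r}")
        month = int(digits.group())
    return f"fp{year.group()}m{month:02d}.csv"
```

For example, swap_dm("ABY0111,4/1/2003,1,F,59.34,Y,4/2/2003 15:10:31") turns both the price date and the timestamp date around, which matches the description of swapping "any dates it spots in a line". Whether Gnash's SwapDM treats the timestamp the same way is an assumption.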
In file fp2001m04.csv there appeared data for LFD1102 on 9/4/2001, this being its first appearance, but then there followed data for 6/4/2001 for LFD1102, provoking Gnash to complain about shifting the base day back. Since data for 5/4/2001 was followed by 9/4/2001, then 6/4/2001, and there was no data for 9/4/2001 between that for 8/4/2001 and 10/4/2001, the data for 9/4/2001 was shifted there.

And still there is obstructionism! File fp2004m01.csv starts with

    PRICE_GIP_GXP_FULL,PRICE_DATE,PRICE_PERIOD,PRICE_TYPE,PRICE,PRICE_RUN_TIME
    ABY0111,01/01/2004,1,F,31.2,02/01/2004 11:13:43

File fp2004m02.csv starts with

    PRICE_GIP_GXP_FULL,PRICE_DATE,PRICE_PERIOD,PRICE_TYPE,PRICE,PRICE_RUN_TIME
    APS0111,01/02/2004,30,F,47.41,02/02/2004 07:46:29

File fp2004m03.csv starts with

    PRICE_GIP_GXP_FULL,PRICE_DATE,PRICE_PERIOD,PRICE_TYPE,PRICE,PRICE_RUN_TIME
    ABY0111,01/03/2004,1,F,0.05,02/03/2004 08:36:35

But file fp2004m04.csv starts as do the earlier files:

    ABY0111,01/04/2004,1,19.96,02/04/2004 12:01:45,F

Describing the data is an excellent protocol, but it works better if it is done every time! Otherwise it is just a damn nuisance. The simplest response is to uphold conformity by conforming to the style of the majority of data files, which lack these initial decorations; nor are the decorations particularly explanatory. But it would be a pity to reject a token attempt at being helpful, so, since Gnash happens to employ a % symbol to signify a comment, in this case the annoyance was abated by prefixing the first line with a % character. This causes the line to be ignored, and so it will be the second line that is inspected in an attempt to identify the format. However, in later developments for other data, yet further file formats were presented, forcing a more elaborate inspection of file formats so that headers became helpful, so these headers were uncommented to be recognised in the more complex scheme.
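The reading rules accumulated so far (ignore a line that starts with %, and take the value from the fourth field unless that field is a code, in which case take the fifth) can be sketched as below. This is an illustrative Python stand-in, not Gnash's code; parse_record and is_number are hypothetical names, and the sketch covers only the simple first scheme with commented-out headers, not the later header-recognising one.

```python
def is_number(text: str) -> bool:
    """True if the field parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def parse_record(line: str):
    """Return (node, date, half_hour, value), or None for a blank or % line."""
    line = line.strip()
    if not line or line.startswith("%"):
        return None  # comment line, e.g. a header prefixed with %
    fields = [f.strip() for f in line.split(",")]
    node, date, half_hour = fields[0], fields[1], int(fields[2])
    # The value is the fourth field, unless a code such as F sits there,
    # in which case it is the fifth.
    raw = fields[3] if is_number(fields[3]) else fields[4]
    return node, date, half_hour, float(raw)
```

Fed the samples quoted so far, parse_record("ABY0111,01/10/1996,1,55.14,,F") takes 55.14 from the fourth field, while parse_record("ABY0111,01/01/2004,1,F,31.2,02/01/2004 11:13:43") skips the F and takes 31.2 from the fifth. An uncommented header line would fail on the half-hour conversion, which is the point of the % prefix in this simple scheme.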
And notice that the HH value field is number five, not four, but this annoyance is already accommodated thanks to the 2001 interlude of such data.

Now consider the first few lines of file fp2001m09.csv (file Final_Sep_2001.csv as was):

    ABY0111 1/09/2001       1       F       83.35   Y       3/09/2001 11:28:04
    ABY0111 1/09/2001       2       F       79.4    Y       3/09/2001 11:28:04
    ABY0111 1/09/2001       3       F       79.46   Y       3/09/2001 11:28:04

As can be seen, despite the .csv suffix, there are no commas separating the values. They are instead in nice neat fixed columns (though aligned left is not so good), and such data are particularly easy to read, and it would have been much easier if all the other data were in this form, but they aren't. And the "aligned left" should have been a giveaway: instead of commas separating the fields, there are tab characters. So here is the sample with spaces instead.

    ABY0111 1/09/2001       1       F       83.35   Y       3/09/2001 11:28:04
    ABY0111 1/09/2001       2       F       79.4    Y       3/09/2001 11:28:04
    ABY0111 1/09/2001       3       F       79.46   Y       3/09/2001 11:28:04

Converting the 2,099,520 tabs to commas was the work of moments.

    ABY0111,1/09/2001,1,F,83.35,Y,3/09/2001 11:28:04
    ABY0111,1/09/2001,2,F,79.4,Y,3/09/2001 11:28:04
    ABY0111,1/09/2001,3,F,79.46,Y,3/09/2001 11:28:04

Conformity is good with computers.

And still the quality of ingenuity is not strained! File fp2003m12.csv exemplifies indecision. One might hope that these data would be rolled forth in some coherent sequence, but no. Since Gnash implements random-access for its storage, the days need not be presented in order; similarly, the HH values for a day need not arrive in sequence. But this was not enough. Data for a supplier may not arrive in one piece, so now the programme holds storage for one day for each supplier so that values may arrive either in the reasonable order of all values for each supplier, one supplier followed by the next, or in the less-reasonable order of hh1 for each supplier, then hh2 for each supplier, etc.
When a datum for a supplier arrives that has a date other than that of the data in hand, then the held data is saved, and preparations are made for a new day for that supplier. The hope is that by then, all of its values would have been encountered. This also was too much to hope for, a drab constraint that presented yet a further opportunity for an exercise in ingenuity. It was surmounted with ease.

Accordingly, programme Gnash was extended to acknowledge missing data by reserving a special value to signify "not a value", or "bad". Thus, a day about to be stored may contain "bad" values (because no datum had been supplied) for particular half-hours, and likewise there may be already in storage some data for that day. If so, the two are compared: for each half hour, should the stored datum be good, an incoming "bad" value does not replace it; conversely, should the previously-stored value be "bad" then the incoming value (hopefully good) replaces it. Only the "ZAP" command can replace a good value by a bad value, in a different process. Subsequently, a feature: routine Swallow juggles good and bad values according to Input.BadValuesPrevail.

Thus, a day can be constructed piecemeal, as the incoming data directs, and Gnash now maintains a version count for each day, though not yet for each half-hour. (ha ha; see below) And with this apparatus in place, it turned out that some revisions effected no change, while others changed afresh what had been previously changed. The flexibility enabled by all the redundant information in each record allows disorganisation, and it was not avoided but revelled in.

File fp2003m12.csv escalated to a further level: some data does not have a code of "F" but "V"; however, such "V" entries are all later revised by "F"-associated values (at least in this file), not that a "Final" value is immune from assault... The sequence annotated below demonstrates these little conceits. For legibility, I have aligned the columns.
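Before that sequence, the holding-and-merging scheme just described can be sketched as follows. This is a Python stand-in under stated assumptions, not Gnash's implementation: the names BAD, SLOTS, merge_day and Store are hypothetical, NaN stands in for the reserved "bad" value, a day is assumed to have 48 half-hours, and an incoming good value is assumed to replace a stored good one (which is what revision implies). The ZAP command, Input.BadValuesPrevail and per-half-hour versions are not modelled.

```python
import math

BAD = math.nan   # assumed stand-in for the reserved "not a value"
SLOTS = 48       # assumed number of half-hours held per day


def is_bad(value: float) -> bool:
    return math.isnan(value)


def merge_day(stored, incoming):
    """Slot by slot: an incoming bad value never displaces what is stored."""
    return [old if is_bad(new) else new for old, new in zip(stored, incoming)]


class Store:
    def __init__(self):
        self.days = {}      # (node, date) -> list of SLOTS values
        self.versions = {}  # (node, date) -> number of times saved
        self.in_hand = {}   # node -> (date, values) currently being assembled

    def accept(self, node, date, half_hour, value):
        """Take one datum; save the held day first if the date has changed."""
        held = self.in_hand.get(node)
        if held is not None and held[0] != date:
            self._save(node, *held)
            held = None
        if held is None:
            held = (date, [BAD] * SLOTS)
            self.in_hand[node] = held
        held[1][half_hour - 1] = value

    def _save(self, node, date, values):
        key = (node, date)
        if key in self.days:
            values = merge_day(self.days[key], values)
        self.days[key] = values
        self.versions[key] = self.versions.get(key, 0) + 1

    def finish(self):
        """Save whatever is still in hand at end of input."""
        for node, (date, values) in self.in_hand.items():
            self._save(node, date, values)
        self.in_hand.clear()
```

With this arrangement the values for a day may arrive supplier by supplier or half-hour by half-hour, and a day that turns up again later in the file is merged into what was stored, raising its version count.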
And notice in passing the appearance in this sequence of the AM/PM style indications rather than the twenty-four hour clock used previously. As well, the dates were originally supplied as month/day/year, but this is now only a trivial obstruction.

  Line  Supplier Date      HH   Value    Concoction timestamp.
400612: BDE0111,28/12/2003,48,V, 42.36,Y,29/12/2003 10:13:35 AM  HH48 for BDE0111; code V, not F.
    13: BEN0161,28/12/2003, 1,V, 0    ,Y,29/12/2003 10:13:35 AM