Thursday, June 14, 2012

File IO Frustration!

Hey Readers,

It's been a little while since my last post. I had a hell of a time getting my tool working, the one that splits the SRTM data into custom-sized chunks. I should start from the beginning though.

So the first step was downloading all of the SRTM data. The data is stored in archives that each cover 5 degrees by 5 degrees at a resolution of 6000x6000. What I wanted to do was write a program that could process all of this raw data. Ideally the plan was to split the data into an image pyramid of sorts, where the top level would be 90 degree by 90 degree chunks and the lowest level would be 1 degree by 1 degree. My thought was to store the data as 16-bit, single-channel DDS files with a resolution of 1024x1024, which I figured was a good size for GPU consumption. I started writing a program that would make a system call to WinRAR to decompress the data into a temp directory and then clean it up afterwards, so that I wouldn't have to decompress all the data at once.
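Just to put the scale in perspective, here's a quick back-of-the-envelope sketch (C++) of what a few tile sizes work out to. I haven't locked down the exact level progression yet, so the degree sizes below are only illustrative:

#include <cstdio>

// Rough pyramid bookkeeping for the scheme described above. The level
// sizes here (90 down to 1 degree) are illustrative, not final.
int main()
{
    const int kTileRes = 1024; // each tile is a 16-bit, 1-channel 1024x1024 DDS
    const double levels[] = { 90.0, 30.0, 10.0, 5.0, 1.0 };

    for (double deg : levels)
    {
        int tilesX = static_cast<int>(360.0 / deg);
        int tilesY = static_cast<int>(180.0 / deg);
        double metersPerPixel = (deg * 111000.0) / kTileRes; // ~111 km per degree
        printf("%5.1f deg tiles: %4d x %3d grid, ~%.0f m/pixel\n",
               deg, tilesX, tilesY, metersPerPixel);
    }
    return 0;
}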

The first problem I ran into was that the raw SRTM data was in an ASCII format, which meant that parsing it took forever. Since I wanted to be able to parse it quickly to convert to different resolutions, I wrote a quick tool which parsed the ASCII file, converted it to binary, and then rezipped the archive for more compact storage. The binary files were also quite a bit smaller than the ASCII ones, though the zipped archives stayed around the same size. The tool averages pixels when downsampling: for each destination pixel it samples all the source pixels that cover its area and takes the average. It currently only handles downsampling; I'd like to add proper upsampling via bicubic filtering in the future.
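For the curious, the averaging is just a box filter. A minimal sketch of the idea (the function name is mine, and it assumes the source dimensions divide evenly by the destination's):

#include <cstdint>
#include <vector>

// Average each block of source pixels down to one destination pixel.
std::vector<int16_t> DownsampleAverage(const std::vector<int16_t>& src,
                                       int srcW, int srcH,
                                       int dstW, int dstH)
{
    std::vector<int16_t> dst(dstW * dstH);
    const int blockW = srcW / dstW; // source pixels covered per output pixel
    const int blockH = srcH / dstH;

    for (int dy = 0; dy < dstH; ++dy)
    {
        for (int dx = 0; dx < dstW; ++dx)
        {
            long long sum = 0;
            for (int by = 0; by < blockH; ++by)
                for (int bx = 0; bx < blockW; ++bx)
                    sum += src[(dy * blockH + by) * srcW + (dx * blockW + bx)];
            dst[dy * dstW + dx] = static_cast<int16_t>(sum / (blockW * blockH));
        }
    }
    return dst;
}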

While I was writing my ASCII to binary converter I came across two facts: one, the SRTM data I was using can't be used in any proprietary software, and two, the free-to-use SRTM data is already in a binary format. I decided to stick with what I've got for now and just make the pipeline flexible enough that I can drop in a different dataset down the road. I wanted to set it up as flexibly as I could in case anyone wants to add custom datasets.

Once I had all the data converted to binary I started work on the tool for generating the final DDS files. My first stab at it kinda worked, but it seemed to be reusing the same data for multiple tiles. I started dumping lots of debug text, trying to find where the indexing could be going wrong. I tried clearing out the source buffer at the beginning of each copy; that got rid of the duplicate tiles but didn't explain why the proper data wasn't coming through. The file loads reported success, but the data they were supposed to be reading in just wasn't there.
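For reference, the copy itself is conceptually just lifting a tile-sized window out of the big source grid, row by row, something like the sketch below (simplified, names are mine). Botch the x0/y0 offsets and every tile gets stamped with the same region:

#include <cstdint>
#include <cstring>
#include <vector>

// Copy one tile's worth of pixels out of the full source grid.
void CopyTile(const std::vector<int16_t>& src, int srcW,
              int tileX, int tileY, int tileRes,
              std::vector<int16_t>& dst)
{
    dst.resize(tileRes * tileRes);
    const int x0 = tileX * tileRes; // this tile's offset into the source
    const int y0 = tileY * tileRes;

    for (int row = 0; row < tileRes; ++row)
    {
        std::memcpy(&dst[row * tileRes],
                    &src[(y0 + row) * srcW + x0],
                    tileRes * sizeof(int16_t));
    }
}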

This had me scratching my head. My initial thoughts were that maybe the file wasn't fully unpacked from the zip when I started loading it, or conversely, maybe it was being half erased by the cleanup of the previous file. This was a bad week, too; my 3-month-old son was sick, so I didn't get any sleep.

After a week of hammering on different things trying to get the data to come through, I still wasn't any closer to an answer. What a setback! I didn't understand what could be going on. The solution was stupid: it turned out there must be some kind of problem with C-style file pointers when opening that many large files in succession. I still don't have an answer for why it wasn't working; I was properly closing the files after I was done with them, so it wasn't that. Anyways, converting my FILE* usage to ifstreams did the trick. It immediately started working as I expected, aside from a grid of black lines, one pixel wide in each direction, which turned out to be an edge row and column that my downsampling needed to account for.
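For anyone hitting something similar, here's roughly what the ifstream version of a load looks like (the function name and error handling are just for this sketch):

#include <cstdint>
#include <fstream>
#include <vector>

// Slurp a whole binary heightmap file into a vector of 16-bit samples.
// The stream closes itself when it goes out of scope.
bool LoadRawHeights(const char* path, std::vector<int16_t>& out)
{
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return false;

    const std::streamsize bytes = file.tellg(); // opened at end, so this is the size
    file.seekg(0, std::ios::beg);

    out.resize(static_cast<size_t>(bytes) / sizeof(int16_t));
    return static_cast<bool>(
        file.read(reinterpret_cast<char*>(out.data()), bytes));
}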

So now I finally have an earth-sized dataset at different resolutions. The next step is to change my elevation quadtrees to sample from the different texture levels as the camera gets closer to the earth.
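I haven't written that part yet, but the rough idea is to map camera altitude to a pyramid level, something like this sketch (the constants are placeholders, not code from the tool):

#include <algorithm>
#include <cmath>

// Pick a pyramid level from camera altitude: level 0 is the finest
// (1x1 degree) tiles, maxLevel the coarsest. The 1 km baseline and the
// one-level-per-doubling falloff are arbitrary for this sketch.
int PickLevel(double cameraAltitudeMeters, int maxLevel)
{
    double level = std::log2(std::max(cameraAltitudeMeters, 1000.0) / 1000.0);
    return std::min(maxLevel, static_cast<int>(level));
}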