Friday, February 18, 2005

Optimization continues


After a rather hectic week, Friday is here at last. Of course, Fridays are not so nice, because I work from 8 am to 8 pm, but after that I have the whole weekend in front of me for doing some more work! During the week I did not have much time to optimize the volume rendering algorithm. I have prepared a nice picture (I hope you like it) of the Chapel Hill Dataset. It is shown here cropped and at half the actual size (when you click on it). The actual image is 700 x 650 pixels and takes about 9.5 seconds to draw. This is way too much, so I will have to get really creative with the optimization tricks. The volume is 208 x 256 x 225 voxels in size (rather small). I have applied a 2D transfer function to show the bones and soft tissues.
I did some more web searching for Delphi optimization and found a really nice site (but with a few links broken). You can find it here. I applied the following changes, and the speed-up is significant:
1. Changed order of nested 'for' loops that loop over a three-dimensional array.
2. Changed FPU precision mode to: SetPrecisionMode(pmSingle);
3. Used multiplication instead of division. I multiply by the inverse, and this is much faster.
4. Turned off range checking and overflow checking (this is rather obvious, but do it after making sure the algorithm works OK): {$R-}{$Q-}
5. Changed sequence of instructions, to group together instructions that use the same variables.
6. Changed variables to smaller size (e.g. single instead of double, byte instead of word), where possible.
7. In-lined small procedures.
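Items 1, 3 and 4 of the list above can be sketched roughly as follows. This is only an illustration, not the actual renderer code: the volume size, the TVolume type and the SumNormalized function are made up for the example. Delphi stores a multi-dimensional array with the last index varying fastest, so the innermost loop should run over the last index, and the division by 255 is replaced by one multiplication with a precomputed reciprocal.

```pascal
program LoopOrderSketch;
{$APPTYPE CONSOLE}
{$R-}{$Q-}  // range and overflow checks off (item 4), only after the code is known to work

const
  NX = 64;  // hypothetical volume dimensions
  NY = 64;
  NZ = 64;

type
  // Last index (x) is contiguous in memory.
  TVolume = array[0..NZ - 1, 0..NY - 1, 0..NX - 1] of Byte;

var
  Vol: TVolume;

// Sums all voxels scaled to 0..1, visiting memory in storage order.
function SumNormalized(const V: TVolume): Single;
const
  InvMax: Single = 1.0 / 255.0;  // reciprocal computed once (item 3)
var
  x, y, z: Integer;
  Acc: Single;
begin
  Acc := 0;
  for z := 0 to NZ - 1 do        // slowest-varying index outermost
    for y := 0 to NY - 1 do
      for x := 0 to NX - 1 do    // fastest-varying index innermost (item 1)
        Acc := Acc + V[z, y, x] * InvMax;  // multiply instead of dividing by 255
  Result := Acc;
end;

begin
  FillChar(Vol, SizeOf(Vol), 255);  // every voxel at maximum intensity
  Writeln('sum of normalized voxels: ', SumNormalized(Vol):0:1);
end.
```

Swapping the x and z loops gives the same result but strides through memory in large jumps, which is the access pattern the reordering is meant to avoid. How much it matters depends on the volume size and the cache, so it is worth timing both versions.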
Changes that I thought would be beneficial but were not:
1. Changing the 'for' variable to any type other than Integer slowed the loop down.
Changes I would like to make but have not figured out how, yet:
1. Substitute 'sqrt' with something faster (even if it is only approximate).
2. Same for 'trunc'.
3. Get rid of cache misses. This seems to be the major problem (I knew that, of course).
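For item 1, one commonly used approximation (not something I have tested in the renderer) reinterprets the bits of a Single as an integer to get a rough first guess and then refines it with one Newton-Raphson step. The sketch below assumes positive, normal input values. The function name FastSqrt is made up, and the constant $1FBD1DF5 is one commonly quoted choice, not the only possible one.

```pascal
program FastSqrtSketch;
{$APPTYPE CONSOLE}

// Approximate square root for positive Single values.
function FastSqrt(x: Single): Single;
var
  y: Single;
  bits: LongWord absolute y;  // the same 4 bytes viewed as an integer
begin
  y := x;
  bits := $1FBD1DF5 + (bits shr 1);  // roughly halves the exponent: crude first guess
  Result := 0.5 * (y + x / y);       // one Newton-Raphson step tightens the guess
end;

begin
  Writeln('approx: ', FastSqrt(16.0):0:4, '  exact: ', Sqrt(16.0):0:4);
end.
```

The refinement step still contains a division, so whether this beats the built-in 'sqrt' has to be measured on the target machine. Dropping the Newton step is faster but leaves an error of a few percent, which may or may not be acceptable for shading.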
It seems that I will not be able to get interactive rates and keep the quality of the image at the same level. So I will probably have to resort to tricks, like drawing at reduced quality while the volume is being rotated or translated. I cannot ask the user to get a faster machine, because I am using a Pentium 4 running at 3 GHz with 1 GB of RAM, which is already a fast machine (for today, at least). Most of the profiling was done with the QueryPerformanceCounter call. I know that this is not very accurate, but it gives a good estimate if you take the average of a few runs. I have tried VTune, but there is a significant learning curve and I did not have much time to go into great depth. It looks promising, though.
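The timing approach mentioned above, averaging a few runs with QueryPerformanceCounter, can be sketched like this. RenderFrame is a hypothetical stand-in for the real drawing routine and AverageMs is a made-up helper name. The sketch assumes a Delphi version in which the Windows unit declares these calls with Int64-compatible parameters.

```pascal
program TimingSketch;
{$APPTYPE CONSOLE}
uses
  Windows;

const
  Runs = 5;  // number of repetitions to average over

// Stand-in for the real drawing routine (hypothetical workload).
procedure RenderFrame;
var
  i: Integer;
  s: Double;
begin
  s := 0;
  for i := 1 to 5000000 do
    s := s + Sqrt(i);
  if s < 0 then
    Writeln(s);  // never true; keeps the result in use
end;

// Average wall-clock time of Count calls to RenderFrame, in milliseconds.
function AverageMs(Count: Integer): Double;
var
  Freq, T0, T1: Int64;
  i: Integer;
begin
  QueryPerformanceFrequency(Freq);  // counter ticks per second
  QueryPerformanceCounter(T0);
  for i := 1 to Count do
    RenderFrame;
  QueryPerformanceCounter(T1);
  Result := (T1 - T0) * 1000.0 / Freq / Count;
end;

begin
  Writeln('average: ', AverageMs(Runs):0:2, ' ms per frame');
end.
```

Timing the whole batch and dividing by the run count smooths out the jitter of individual runs, at the cost of hiding any single slow frame.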

Sunday, February 13, 2005

Ray casting without errors

At last I have managed to complete the ray casting procedure in Delphi and it runs without errors. Now I need to make it much faster. I am using the data set from the Institute for Anthropology, University of Vienna, as one of my benchmarks. You can download it from here. The dataset is of a cranium without the mandible (too bad), at two resolutions. I am using the low resolution version (voxel size of 1 mm, total voxels: 218 x 218 x 142). In a window of approximately 470 x 470 pixels, it takes more than 6 seconds to draw. My target is to reduce this to less than a second. I guess I could do it if I incorporate the IsoRegion idea. However, this particular dataset is rather noisy and the empty space around the cranium is not so empty. In contrast, the CT scan from the Chapel Hill Volume Rendering Test Dataset (get it here) is much 'smoother' and should benefit significantly from IsoRegion leaping.

Thursday, February 10, 2005

Chocolates from Geneva

Yesterday I got an unexpected gift, a small box of chocolates from Geneva. They are Du Rhone. The gesture was very kind and I am grateful. The chocolates were fine but I was not ecstatic. I remember that the best chocolates I have tasted were bought in Geneva on rue du Mont Blanc, two years ago. I will try to find some time and get some more this April, when I will be there. They were fantastic (but the price was also on the same level).
Meanwhile, the volume rendering procedure in Delphi is moving along. I have weeded out a few bugs and I think I can have it working in the next few days. Then it will be a matter of optimizing the code heavily. Those who are interested in volume rendering should definitely look into Stefan Bruckner's Master's thesis. I got some nice ideas from there, but I am not implementing it exactly as he proposes. I have kept the idea of subdividing the whole volume into bricks, but I am not using his rather complex addressing scheme. Instead, I have decided to duplicate some of the voxels (i.e. have the bricks overlap slightly) so that calculating gradients is easier (and hopefully faster). I am also thinking of implementing the idea of IsoRegion leaping, which I read about in a paper by Fung and Heng. A preliminary test showed that I should get a speed-up by a factor of 3 or 4, which is very significant.
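To show why overlapping bricks make gradients easier, here is a minimal sketch, not my actual data layout. It assumes a brick of B x B x B interior voxels padded with one layer of voxels copied from the neighbouring bricks, and a central-difference gradient. The names TBrick, TVec3, Gradient and FillRamp, and the brick size, are all made up for the example.

```pascal
program BrickGradientSketch;
{$APPTYPE CONSOLE}

const
  B = 8;  // hypothetical number of interior voxels per brick edge

type
  // Indices 0 and B + 1 hold copies of the neighbouring bricks' voxels.
  TBrick = array[0..B + 1, 0..B + 1, 0..B + 1] of Byte;
  TVec3 = record
    gx, gy, gz: Single;
  end;

var
  Brick: TBrick;
  g: TVec3;

// Central-difference gradient at interior voxel (x, y, z), with 1 <= x, y, z <= B.
// Because of the border layer, no boundary test or second brick is needed.
function Gradient(const Bk: TBrick; x, y, z: Integer): TVec3;
begin
  Result.gx := 0.5 * (Bk[z, y, x + 1] - Bk[z, y, x - 1]);
  Result.gy := 0.5 * (Bk[z, y + 1, x] - Bk[z, y - 1, x]);
  Result.gz := 0.5 * (Bk[z + 1, y, x] - Bk[z - 1, y, x]);
end;

// Test data: intensity rises by 10 per voxel along x only.
procedure FillRamp(var Bk: TBrick);
var
  x, y, z: Integer;
begin
  for z := 0 to B + 1 do
    for y := 0 to B + 1 do
      for x := 0 to B + 1 do
        Bk[z, y, x] := 10 * x;
end;

begin
  FillRamp(Brick);
  g := Gradient(Brick, 4, 4, 4);
  Writeln('gradient: ', g.gx:0:1, ' ', g.gy:0:1, ' ', g.gz:0:1);
end.
```

The price is memory: with B = 8, each brick stores 10 x 10 x 10 voxels instead of 8 x 8 x 8, so larger bricks waste proportionally less on the duplicated border.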

Wednesday, February 09, 2005

Delphi optimization

I am trying to code a volume renderer to display computed tomography data. I have been struggling with it for some weeks now and have managed to produce a ray casting procedure, but it is still very slow (I will describe this in another posting later on). Searching the Internet, I have come across some rather advanced (for me at least) optimization strategies for Delphi, which may interest some of you. Use Google to look for 'CodingForSpeedInDelphi.doc' by Dennis Christensen. It mentions VTune, a tool from Intel that can be used to check for cache misses, cache thrashing and other esoteric stuff. Intel offers the software for a trial period of one month, but be prepared for a very long download (the size is about 200 MB!). I haven't tried it yet, but in the meantime I have applied some of the advice in the Christensen document (and in a ppt file, also found by Google: singlepe_optimize.ppt). By re-ordering some 'for' loops and replacing divisions with multiplications by the reciprocal, I have managed to shave off approximately 1 full second from the ray casting procedure (which amounts to 10% of the total). Now, if only I could get rid of those annoying Out of Range errors!

Tuesday, February 08, 2005

Welcome!

I start this blog with great reservations, because I doubt that I am going to keep it up for long. Anyway, I decided to experiment and see. This blog is for those who are interested in teeth and chocolate (not necessarily in that order, and not in the restricted sense of the words). I am into orthodontics, software development, teaching, and eating chocolates, besides other trivial pastimes. Postings here will contain info on any topic that comes to mind, in the hope of contributing to the ever-expanding amount of useless knowledge that resides on the Internet.