Drop Early, Drop Often

Don't wait until the end of your QlikView load script to drop temporary tables—get rid of them as soon as possible and reap rewards in the form of memory, CPU, and time savings!

 
This is going to be a short post, because my advice to you today is extremely simple: drop your temporary tables as early as possible. I have to confess, for many years I was guilty of not following this advice myself. One of my favorite development techniques in extract/transform layers was to create an iterative process at the end of the load script to simultaneously store/drop all loaded tables. Recently, however, I ran into a constraint. I needed to load several large tables and simply ran out of RAM. I realized, however, that I can make QlikView release RAM by dropping tables in the model that I was not actually using anymore. This easy technique allowed my script to successfully execute, where it was failing before. However, even if your situation is not this extreme, this painless technique will allow you to save RAM and CPU cycles, which is great even if you have a private development box, but is particularly nice if you happen to be sharing a development server. By creating lean load scripts, you will become a better server co-tenant, and everyone wins. As an added bonus, QlikView performs more snappily when it has less tables in RAM to manage, and you script will likely finish sooner.
 
Here are before/after performance stats on an application that I improved using the "drop early" technique. This is not the one that I mentioned was failing earlier, and is not a particularly dramatic example. Nor is it perfect, because even in the old model I had dropped some tables early. However, you can see some significant improvements here. Most obvious are the improvements in RAM usage and load time. Both maximum and average RAM usage is much lower in the "drop early" version than in the "drop late." And we were able to shave 4 minutes off the reload time, while sacrificing absolutely nothing.
 
RAM

 
CPU

 
Happy Qliking!
This entry was posted in Development, Tips & Tricks and tagged , , . Bookmark the permalink.

9 Responses to Drop Early, Drop Often

  1. Aaron Couron says:

    In addition to the RAM and CPU savings, you also seemed to shave about 25% off the reload time. Such a simple technique to reap such a large reward. I have switched between dropping at end and during almost on a whim (or on the time I have available to develop) so now that you have quantified this I think I need to drop tables ASAP from now on. Great post.

  2. Bruno Decruynaere says:

    Wouldn’t it be good if also mapping tables could be dropped to free memory? In some applications we need tons of large mapping tables and these are only dropped when script finishes. In many cases these tables are used only once so could be dropped much earlier, but as far as I know this is not possible. As you demonstrate it could help a lot in scrip run performance.

  3. calfre2020 says:

    This post is really nice and informative. The explanation given is really comprehensive and informative..

  4. MauritzZ says:

    Hi Vlad

    Since Andrius just linked to this post again in his LinkedIn I suspect that it will get new attention. I am also struggling with a situation where I have a few mapping tables that are just getting concatenated every time I loop through a DB and my suspicion is that they must be HUGE by the end. Anyways, I was wondering how you measured the performance? I am planning on changing logic in the script and was hoping to get a benchmark profile of the current reload and then I would like to compare the new script to the old script.

    Hopefully you can still remember ;).

    Thanks for the great content!

    • Hi, Mauritz. Mapping tables are tricky because there’s no way to drop them that I know of. You can perhaps achieve a bit of savings by loading them immediately before you have to use them. Not sure how relevant that suggestion is in your case. You can measure performance like I did by creating performance counters in Windows. First, open QlikView Desktop; leave the window open but don’t open your actual app yet. Go to Administrative Tools >> Performance Monitor. Go to Data Collector Sets >> User Defined. Create a new data collector set (manually, not from a template). Select “Performance counter” then “Add”. Select the arrow next to “Process” and select the following 2 measures: (1) % Processor Time, and (2) Working Set. In the bottom left pane, select “Qv”. Then press the “Add >>” button. Change the sample interval to every 5 seconds. Press Next and pick a directory where you want your logs saved. Press Finish. You’ll now see your data collector in “Stopped” state. Before you start it, double click it. Then right-click the performance counter and select properties. Change the log format from “Binary” to “Comma Separated” and press OK. Go back up one level, right click the data collector set, and press Start. If you go to your log directory, you’ll see that a folder has been created with a CSV inside it. Now switch back to your QlikView window, open your app, and run it. Once your reload is complete, you can stop the collector set, and use QlikView itself to read/graph it. Pretty neat, eh? 🙂

      • MauritzZ says:

        Hi Vlad

        Thanks so much for the quick response. I agree that mapping tables, although very usefull are a pain when it comes to freeing up resources. I’m going to break my one app/reload which loops through DBs into different reloads for each DB and chaining them to see if I can 1) avoid getting a very large mapping table with the concatenated results which I am sure makes looking up values less performant and 2) keep the resources consumed over time less. This will be done on a server with Qlik Sense, but I am sure that I’ll be able to use and adapt your suggestion above in QS. Thanks again!

        • No problem, happy to help. If I were in your shoes, I would probably copy the script to QV for performance testing, unless you can isolate a dedicated QS engine for just your testing. Otherwise you won’t get accurate numbers. If it’s QV 12+, the performance of the script should be identical to QS.

Leave a Reply

Your email address will not be published. Required fields are marked *

Notify via email when new comments are added

Blog Home
Categories
Archives