Roger's Access Library Homepage
Forum Home Forum Home > Other Download Libraries > MS Access MVP Libraries > Tejpal, A.D.
  New Posts New Posts RSS Feed - Query_TopsPerSubGrpLargeData.mdb
  FAQ FAQ  Forum Search   Events   Register Register  Login Login

>
Want a good read? Try The Summer of His Life (available on Amazon!)

Query_TopsPerSubGrpLargeData.mdb

 Post Reply Post Reply
Author
Message
A.D. Tejpal View Drop Down
Microsoft MVP
Microsoft MVP


Joined: 30 Jun 2008
Status: Offline
Points: 192
Post Options Post Options   Thanks (0) Thanks(0)   Quote A.D. Tejpal Quote  Post ReplyReply Direct Link To This Post Topic: Query_TopsPerSubGrpLargeData.mdb
    Posted: 07 Jul 2008 at 11:10pm
Description: 

    1 - This sample db examines various alternatives for getting Top values (as per 
date) so as to identify the most efficient method in the context of large data. Main 
data table has test results recorded for various patients over various dates. The 
objective is to find (Top result means latest result as per date):
     (a) Top 3 Results per test per patient.
     (b) Top 2nd and 3rd Results per test per patient. 
     Note - Task (b) involves extra processing load as compared to (a)

    2 - As a query acting directly on the whole data at a time is found to be 
impractical (it is extremely slow or hangs up altogether in case of large data size), 
five alternative methods for handling each of the objectives mentioned at 1 (a) & (b) 
are demonstrated.

    3 - For facilitating experimentation with test data, form F_FillTables enables 
programmatic population of data tables as per the size desired. For this task, two 
alternative styles are demonstrated:
     (a) Based purely upon recordsets  (Very slow).
     (b) Combination of recordsets and append query (Extremely fast).
     Note - For building up test data consisting of  1200,000 records (1000 patients, 
20 tests per patient, 60 readings per test), method (a) takes 4,605 seconds while 
method (b) does it in just 210 seconds.

    4 - Performance of methods using form interface (A, AA and B to E, F, FF and G 
to K ) for getting the top results gets automatically logged in table T_Log.

    5 - For academic interest, sample queries acting directly upon the whole data, are
 also given (in series Q_Z_L to Q_Z_R). These can be tested on comparatively 
small sized data on account of slow speed and tendency to hang up if data size is 
large.

    6 - Conclusions:
     Results of certain trials are contained in table T_Conclusions. Conclusions 
based upon these are summarized below:

     6.1 - Piecemeal processing of data in convenient packets, through saved 
queries with Top clause, using filter criteria based upon global variables fed via user 
defined functions, and storage of output in an interim table (T_Result), is found to
 be fastest (Form options A, B, F and G).

     6.2 - Using reference to form controls  (Form options AA and FF) instead of user 
defined functions based upon global variables as in 4.6.1 above for performing the 
same job, results in noticeable slow-down. For instance, using option A or B, a result 
set having 40,000 Top 2nd and 3rd results based upon data having 1200,000 
records gets displayed in 581 sec, whereas, using option AA, it takes 798 sec.

     6.3 - Action similar to 6.1 above, but using SQL string built at run time (Form 
options D and J), is found to be slower than that using saved queries with filtering 
via global variables, though faster than saved queries with filtering via reference 
to form controls. For instance, using option D, a result set having 40,000 Top 2nd
and 3rd results based upon data having 1200,000 records gets displayed in 630 sec as
compared to 581 sec for options A or B, and 798 sec for option AA. 4.6.4 - Action similar to 6.1 above, but using saved query with a count subquery
(Form options C and H), is found to be very slow. For instance, using option C, a result set having 4,000 Top 2nd and 3rd results based upon data having 120,000 records takes 1,082 sec to get displayed, as compared to mere 17 sec for options A or B. 6.5 - Using a supplementary table (T_Temp) to store the filtered piecemeal records (Form options E and K), before processing the same and appending to results table (T_Result), does not offer any advantage. Instead, it takes almost double the time for options A and F. 6.6 - Queries acting directly upon the whole data at a time are all found very slow and prone to hang up in case of larger data size. For academic information, some of the comparative results, based upon data consisting of 24,000 records are furnished below: 6.6.1 - Top 3 results - 1200 records in result set. (This task, if performed by form option F, takes MERE 2 Seconds). (a) Query: Q_Z_L_Top3_TotalsQry_TopInHaving - 150 Sec (First screenfull appears in 5 Sec). (b) Query: Q_Z_M_Top3_TotalsQry_TopInWhere - 140 Sec (All Results at completion only). 6.6.2 - Top 2nd and 3rd results - 800 records in result set. (This task, if performed by form option A, takes MERE 3 Seconds). (a) Query: Q_Z_N_Top2nd3rd_PlainQryWithCountSubQuery - 212 Sec (All Results at completion only). (b) Query: Q_Z_O_Top2nd3rd_TotalsQryCountInHaving - 217 Sec (First screenfull appears in 10 Sec) (c) Query: Q_Z_P_Top2nd3rd_TotalsQryCountInWhere - 210 Sec (All Results at completion only). (d) Query: Q_Z_R_Top2nd3rd_PlainQryWithUDF - 130 Sec (All Results at completion only). Versions - Access 2000 File Format (Developed on Access 2003. Also tested on Access XP & 2K)

 

Download Access 2000 format: uploads/37/Query_TopsPerSubGrpLargeData.zip
--

Email: adtejpal@gmail.com
Back to Top
Sponsored Links
>
Want a good read? Try The Summer of His Life (available on Amazon!)

Back to Top
 Post Reply Post Reply
  Share Topic   

Forum Jump Forum Permissions View Drop Down

Forum Software by Web Wiz Forums® version 12.03
Copyright ©2001-2019 Web Wiz Ltd.