Re: Datafile and Index rebuild time - limit on a 4D application

Subject :Re: Datafile and Index rebuild time - limit on a 4D application
From :Jeffrey Kain
Date :Sunday, October 6, 2013 at 7:43 AM
Link :https://kb.4d.com/resources/inug?msgid=GmailId1418e3a10cc2cd10
Memory and 64-bit are more important than the number of cores. If you use the
64-bit server and an SSD, I would think a 21GB database would go faster
than 3 hours, but it also depends on the number of records, number of
indexes, etc.



You can throw all the cores you want at this, but you'll still probably be
limited by the throughput of your disk controller. Once your disk I/O path
becomes saturated, no amount of extra cores will help.



One tip is to make sure your data file is compacted before indexing. A
fragmented data file will take a LOT longer to index. We find that the
time to compact a database and rebuild the indexes is usually less than
the time to reindex a severely fragmented data file.
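For what it's worth, compacting can also be scripted rather than done through the Maintenance and Security Center. A rough sketch (from memory, so check the COMPACT DATA FILE documentation for your version; the paths and file names are illustrative):

```4d
  // Rough sketch: compact programmatically before a reindex
  // COMPACT DATA FILE rewrites the records into a new, defragmented data file
C_TEXT($newData_t)
$newData_t:="C:\\Databases\\MyApp\\data_compacted.4DD"  // illustrative path
COMPACT DATA FILE(Structure file;$newData_t)
```

You'd then restart the server on the compacted data file before kicking off the reindex.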



Another tip is to use SQL to rebuild your indexes instead of just letting
4D do it on its own (something like this... going from memory):

Begin SQL
   ALTER DATABASE DISABLE INDEXES;
   ALTER DATABASE ENABLE INDEXES;
End SQL



Why?



If you let 4D rebuild the indexes, they get rebuilt in the order the
indexes were created. If your application is like ours, after the core
tables are indexed it starts to bounce around from table to table,
reflecting the order in which you added indexes as your application
developed. With a large database this is hard on the cache: it has to load
all the record data for a table, build the index, then switch tables, load
different record data (possibly purging the cache of data from the earlier
table), build those indexes, switch back to the first table, start loading
the same data as before (again possibly purging the cache as it goes)...
and repeat a few dozen times. It's extremely inefficient.



The SQL commands, for whatever reason, index in table order. All the
indexes for a table get built at once, so you get MUCH better use of the
cache, with just one "loading data" step per table if you have enough
memory. On a big database (and I wouldn't classify 21GB as being that big
for v12/v13 and a 64-bit server) you may save hours using the SQL commands.



Finally, consider developing a procedure for what we call parallel
compacting. Since v11, you can compact the database without invalidating
the journal file. To compact in parallel with the system in production,
restore the last backup to a different computer and compact it there while
the production system continues to run. When the compact and reindex are
finished, copy the compacted database to the production server, and bring
down the production server just long enough for it to integrate the
journal file (usually measured in minutes instead of hours). Then bring
the production server back up using the compacted and integrated database.

You need to be sure that you don't make ANY modification to the restored
database, so find a way to prevent any of your On Server Startup code from
running when you're doing a parallel compact (we check for the presence of
a folder named 'parallel' in the data file directory and use it as a flag
to prevent On Server Startup from executing).
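To make the flag idea concrete, the guard in On Server Startup might look
something like this (just a sketch of the approach described above; the
data file name and path handling are illustrative, not our actual code):

```4d
  // On Server Startup (sketch): skip all startup code on a parallel-compact copy
C_TEXT($dataFolder_t)
  // Data file returns the full path of the data file; trim the file name
  // to get its enclosing folder ("data.4DD" is an illustrative name)
$dataFolder_t:=Substring(Data file;1;Length(Data file)-Length("data.4DD"))
If (Test path name($dataFolder_t+"parallel")=Is a folder)
	  // 'parallel' flag folder present: leave the restored copy untouched
Else
	  // normal On Server Startup code goes here
End if
```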



Hope this gives you some ideas on ways to speed things up,



Jeff



On 10/6/13 1:16 AM, "David Ringsmuth" wrote:



>I'm interested to know if you've had many instances where your 4D
>application datafile recovery time has been a significant availability
>planning issue.
>
>I'm working with a 21GB datafile and had to rebuild the indexes, which
>took 3 hours on an SSD, or 6 hours on a non-SSD.





**********************************************************************

4D v13 is available now - with more than 200 new features to make

your applications richer and faster

http://www.4d.com/products/new.html



4D Internet Users Group (4D iNUG)

FAQ: http://lists.4d.com/faqnug.html

Archive: http://lists.4d.com/archives.html

Options: https://lists.4d.com/mailman/options/4d_tech

Unsub: mailto:4D_Tech-Unsubscribe@xxx.xxx

**********************************************************************

Subject :RE: Datafile and Index rebuild time - limit on a 4D application
From :David Ringsmuth
Date :Sunday, October 6, 2013 at 12:55 AM
Link :https://kb.4d.com/resources/inug?msgid=GmailId1418c5de72a5978d
Keisuke,

My tests were on a dual-core box. I did not notice the second core working
much during indexing.

My planning is for an 8-core+ box. I'll test on an 8-core machine and see
what the difference is.

David





-----Original Message-----
From: Keisuke Miyako [mailto:Keisuke.Miyako@xxx.xxx]
Sent: Sunday, October 06, 2013 12:12 AM
To: davidinug@xxx.xxx; 4D iNug Technical
Subject: Re: Datafile and Index rebuild time - limit on a 4D application

But the index builder is already multi-threaded...

On 2013/10/06 at 14:10, "David Ringsmuth" wrote:



> What do you think about this feature request? Would you vote for it?




