Last Update: 2007 Dec 12
Summary
I developed a rough solution which reduces the time to find similar AttributeProperty objects in a Table from 68 seconds to ~16 seconds on an AMD Duron 1.1GHz box running Windows 2000 SP4. My original time measurements were gathered while running Abiword attached to a debugger. Whoops.
Diffs can be found here: http://www.geocities.com/typopl/bug5291diff.txt
Wed Oct 1 2003 Update
After a few more patches, Abiword is no longer the laggard in import performance with documents containing large tables.
Here's a table that shows the time in seconds for importing the RTF spec using Abiword, along with each set of patches:
Fri Oct 3 2003 Update
A smaller bunch of patches were created which reduce the time for "Building document...".
I've updated the table for the latest results (half the patches from Patch #4 have been committed) and added entries for import times for other text apps.
Wed Oct 8 2003 Update
Robert Wilhelm reworked my UT_StringPtrMap::pick patch to completely remove all use of temp UT_String's.
I had a similar patch on my box. Hats off to him for getting the faster version checked in.
Here's a table that shows the time in seconds for importing the RTF spec: using Abiword, along with each set of patches, plus other text apps:
| App | Import doc (secs) | Import doc + Build doc (secs) |
| Abiword 2.0.0 | 68 | 92 |
| ... + Patch 1 | 16 | 33 |
| ... + Patches 1-2 | 9 | 26 |
| ... + Patches 1-3 | 4 | 23 |
| ... + Patches 1-4 | 4 | 15 |
| ... + Patches 1-5 | 4 | 14 |
| Open Office 1.1 | 4 | 12 |
| Microsoft Word 2000 | 3 | 9 |
| Microsoft Wordpad | 2 | 2 |
With the release of AbiWord 2.4.0, I decided to check the import speed of the latest word processors.
A Toshiba Satellite A70 (3.06GHz Mobile Intel P4) running Windows 2000 SP4 was used for timing.
The Windows Date/Time Control Panel was used for timing and the Mem Usage column of the Windows Task Manager was used for the Memory used data. I minimized and restored the app before timing the RTF spec import.
Things don't look too good for the latest open source word processors.
The latest versions of both OpenOffice and AbiWord are slower than the previous versions.
AbiWord 2.4.0 hangs when loading the RTF spec file. I killed the app after 45 minutes.
Looking at the AbiWord Bugzilla database, one possible bug causing the hang may be Bugzilla Bug 7814 - inconsistent piecetable on RTF import - created 2004-Nov-2.
I added an entry for the time it took AbiWord to load and paginate 98% of the RTF spec before the hang occurs.
| App | Import doc (secs) | Memory used (MB) |
| Microsoft WordPad | 1-2 | 7-8 |
| Microsoft Word 2003 - UI responsive | 3 | 17.4 |
| Microsoft Word 2003 - lazy repagination finished | 10 | 17.4 |
| OpenOffice 1.1.5 | 5-6 | 29.8 |
| AbiWord 2.2.10 | 17-19 | 101-103 |
| OpenOffice 2.0rc | 9-10 | 38-42 |
| AbiWord 2.4.0 | Stopped after 2700secs (45min) | 105 |
| AbiWord 2.4.0 | 64secs @ 98% loaded | 105 |
Abiword 2.4.1 was released. I downloaded it and found the hang still occurred.
Robert Wilhelm entered a bug and an AbiWord dev was able to checkin a fix for the bug.
See commit: Re: CVS: commit msevior abi/src/text/ptbl/xp pt_PT_Append.cpp.
With this fix, AbiWord doesn't hang anymore. Robert was also able to track down a major cause of slowdown from version 2.2.x.
See Re: load time regression: Re: commit: Change Record UUID's.
| App | Import doc (secs) | Memory used (MB) |
| AbiWord 2.4.1 with MSevior's fp_line.cpp checkin | 67 | 96 |
| AbiWord 2.4.1 with MSevior's fp_line.cpp checkin and UT_UUID::_makeUUID changed to counter | 36 | 96 |
I tried out Corel Office 12, namely WordPerfect® 12, to see how fast it could load the RTF 1.5 spec. N.B. The WordPerfect UI looks like it has barely changed since 1995
Final Result: WordPerfect® 12 tries to load the RTF 1.5 spec and then displays a dialog stating "Unknown File Format".
| App | Import doc (secs) | Memory used (MB) |
| WordPerfect® 12 | Fails to load RTF 1.5 spec | N/A |
Abiword 2.4.2 and Corel WordPerfect® Office X3 have been released.
The main slowdown in Abiword has been fixed. WordPerfect® X3 is able to load the RTF 1.5 spec successfully. The UI still looks circa 1995 though.
| App | Import doc (secs) | Memory used (MB) |
| WordPerfect® X3 | 8-11 | 60-70 |
| Abiword 2.4.2 | 30-38 | 105 |
Abiword 2.4.6 and Abiword 2.5.2 have been released.
No change in Abiword 2.4.x branch. Abiword 2.5.x branch seems to be about 10-15% faster. Memory used in 2.5.2 is 40% less than 2.4.6. System used for timing is the same Toshiba laptop as before.
| App | Import doc (secs) | Memory used (MB) |
| Abiword 2.5.2 | 28-30 | 70-77 |
| Abiword 2.4.6 | 33-34 | 128 |
Links:
Problem
Bug 5291 describes slow document loading with a particular RTF file.
The problem seems to be in the table code for handling Attributes and Properties.
Profiling Abiword loading the document in question gives us:
| Module Statistics for abiword.exe | |||
| Time in module: | 105,683.073 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 11,516 | ||
| Hits in module: | 229,915,486 | ||
| Module function coverage: | 0.1% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 54,307.062 | 51.4 | 54,307.062 | 51.4 | 42,491,930 | _UT_XML_stricmp |
| 26,297.882 | 24.9 | 103,609.225 | 98.0 | 17,377,706 | PP_AttrProp::isExactMatch |
| 8,060.975 | 7.6 | 8,060.975 | 7.6 | 42,496,802 | UT_String::c_str |
| 6,067.966 | 5.7 | 6,067.966 | 5.7 | 42,496,802 | UT_StringPtrMap::_key |
| 5,294.070 | 5.0 | 5,294.070 | 5.0 | 31,642,386 | UT_StringPtrMap::_next |
| 3,581.270 | 3.4 | 3,581.270 | 3.4 | 10,890,438 | UT_StringPtrMap::_first |
| 2,073.847 | 2.0 | 105,683.073 | 100.0 | 27,492 | pp_TableAttrProp::findMatch |
| 0.000 | 0.0 | 0.000 | 0.0 | 42,491,930 | _UT_stricmp |
From just 27,492 calls to pp_TableAttrProp::findMatch results in 42,491,930 calls to UT_XML_stricmp and others.
UT_XML_stricmp and PP_AttrProp::isExactMatch seem to be the main culprits.
Setup
Windows 2000 SP4. MS VC6 with SP5.
Abiword source code from abiword-2.0.0.tar.gz
Code Investigation
I intermix map and hash in the following description.
In pp_TableAttrProp::findMatch, there's a for-loop iterating over every PP_AttrProp in a vector.
In PP_AttrProp, attributes and properties are stored in separate hashes along with their associated values. These are all strings.
In PP_AttrProp::isExactMatch, there's a test to see that the checksums of the two PP_AttrProp are equal before proceeding,
followed by several other inexpensive checks before actually comparing each Attribute and Property for equality.
PP_AttrProp::isExactMatch goes thru the attribute and property hashes using a cursor.
Looking at the code for UT_StringPtrMap shows that the cursor iterates over the array
for the hash, skipping over empty and deleted slots.
N.B. the comparisons in PP_AttrProp::isExactMatch use a case-insensitive string comparison.
In PP_AttrProp::_computeCheckSum, the checksum is computed only if
there are both attributes and properties, otherwise the checksum is 0.
The checksum consists of the sum of the lengths of each attribute and property and their associated values.
Possible Flaw
________________________________________________________________________________
PP_AttrProp::isExactMatch goes thru the Attribute and Property maps.
It assumes that if the order of items in a map is the same, and the
items in the map are equal (using a case-insensitive comparison), then
the PP_AttrProps are equal.
But the hash function used to put a PP_AttrProp in the map doesn't
handle items in a case-insensitive manner.
Therefore two items with the same text but different case may get placed in different areas of the map. So the order that the cursor returns items may be different based on Attributes/Properties that differ only in case.
However Attributes are converted to lowercase before they're added to the map.
So Properties are the possible problem area, espcially if there are lots of properties in the map.
Proposed Solution
PP_AttrProp::_computeCheckSum to better
differentiate unique PP_AttrProps. also calculate checksum if we have only
attribs and no properties or vice-versa.pp_TableAttrProp directly because the code returns an index of the
AP in the vector to the caller. So, create another vector, same size
as the main AP vector, but this secondary vector is sorted based on
the checksum.strcmp instead of
stricmp when comparing the attribute names.Here's the result of profiling the new version of Abiword loading the RTF doc:
| Module Statistics for abiword.exe | |||
| Time in module: | 944.385 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 11,528 | ||
| Hits in module: | 2,420,280 | ||
| Module function coverage: | 0.1% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 325.836 | 34.5 | 325.836 | 34.5 | 270,068 | _UT_XML_stricmp |
| 175.141 | 18.5 | 710.231 | 75.2 | 52,284 | PP_AttrProp::isExactMatch |
| 84.154 | 8.9 | 84.154 | 8.9 | 365,240 | PP_AttrProp::getCheckSum |
| 74.831 | 7.9 | 138.308 | 14.6 | 256,856 | compareAPBinary |
| 66.118 | 7.0 | 66.118 | 7.0 | 342,016 | UT_String::c_str |
| 49.581 | 5.3 | 49.581 | 5.3 | 271,112 | UT_StringPtrMap::_next |
| 46.566 | 4.9 | 46.566 | 4.9 | 342,016 | UT_StringPtrMap::_key |
| 37.726 | 4.0 | 176.034 | 18.6 | 27,492 | UT_Vector::binarysearch |
| 35.200 | 3.7 | 944.385 | 100.0 | 27,492 | pp_TableAttrProp::findMatch |
| 24.010 | 2.5 | 24.010 | 2.5 | 71,948 | _UT_XML_strcmp |
| 22.979 | 2.4 | 22.979 | 2.4 | 106,856 | UT_StringPtrMap::_first |
| 2.243 | 0.2 | 2.243 | 0.2 | 16,832 | PP_AttrProp::getIndex |
| 0.000 | 0.0 | 0.000 | 0.0 | 270,068 | _UT_stricmp |
Just to make sure the checksum calculations are not too expensive, I profiled the old and new Abiwords AP checksum code while loading the RTF doc.
It looks like the new checksum calculation is possibly 20x more epxensive.
But it saves more time in pp_TableAttrProp::findMatch.
| Module Statistics for OLD abiword.exe | |||
| Time in module: | 25.680 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 11,526 | ||
| Hits in module: | 97,561 | ||
| Module function coverage: | 0.1% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 14.012 | 54.6 | 25.680 | 100.0 | 22,178 | PP_AttrProp::_computeCheckSum |
| 4.826 | 18.8 | 4.826 | 18.8 | 28,922 | _UT_XML_strlen |
| 2.559 | 10.0 | 2.559 | 10.0 | 14,461 | UT_StringPtrMap::_next |
| 1.871 | 7.3 | 1.871 | 7.3 | 14,461 | UT_String::c_str |
| 1.857 | 7.2 | 1.857 | 7.2 | 14,461 | UT_StringPtrMap::_key |
| 0.556 | 2.2 | 0.556 | 2.2 | 3,078 | UT_StringPtrMap::_first |
| Module Statistics for NEW abiword.exe | |||
| Time in module: | 470.589 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 11,522 | ||
| Hits in module: | 1,340,335 | ||
| Module function coverage: | 0.1% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 156.623 | 33.3 | 470.589 | 100.0 | 22,178 | PP_AttrProp::_computeCheckSum |
| 151.201 | 32.1 | 151.201 | 32.1 | 284,783 | _UT_lowerString |
| 47.938 | 10.2 | 47.938 | 10.2 | 289,956 | _UT_XML_strlen |
| 47.305 | 10.1 | 47.305 | 10.1 | 284,783 | _UT_XML_strncpy |
| 25.357 | 5.4 | 25.357 | 5.4 | 144,978 | UT_StringPtrMap::_next |
| 19.502 | 4.1 | 19.502 | 4.1 | 144,978 | UT_String::c_str |
| 18.860 | 4.0 | 18.860 | 4.0 | 144,978 | UT_StringPtrMap::_key |
| 3.802 | 0.8 | 3.802 | 1.2 | 23,701 | UT_StringPtrMap::_first |