Date: Mon Jan 29 2004
Authour: Johnny Lee
Email: typo_pl@hotmail.com
Summary
I took a look at Abiword bug 5880 and my profiled run of Abiword on this test case produced similar results to the profiled output attached to the bug report.
After much time trying to understand what the code was doing, I devised a solution which reduced the runtime for this operation to <2% of the original runtime!
Diffs can be found here: http://www.geocities.com/typopl/bug5880diff.txt
This table shows the time to change the language of all the text in a document from English to Greek using Abiword and using Abiword with my proposed patch.
The time is the average of three runs. The document I used is the document attached to the bug report for bug 5880.
| App | Operation time (ms) |
| Abiword 2.1.0 | 8500 |
| ... + Patch 1 | 120 |
Links:
Setup
Windows 2000 SP4. MS VC6 with SP5.
Abiword source code, abiword-2.1.0.tar.gz
Problem
Bug 5880 describes an operation in Abiword that takes too long - changing the language of all the text in a document.
The following table shows a profile of Abiword changing the language for a large text selection.
One call to FV_View::setCharFormat proceeds to 512 calls to
pt_PieceTable::_fmtChangeSpanWithNotify resulting in a similar number of calls to
FV_View::getCharFormat which causes millions of calls to
PP_AttrProp::getProperty and PP_AttrProp::getAttribute.
| Module Statistics for abiword.exe | |||
| Time in module: | 60,182.815 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 12,160 | ||
| Hits in module: | 192,006,177 | ||
| Module function coverage: | 6.0% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 11,825.018 | 19.6 | 19,853.479 | 33.0 | 19,791,049 | UT_StringPtrMap::find_slot |
| 5,715.443 | 9.5 | 52,835.406 | 87.8 | 4,131,906 | PP_evalProperty |
| 5,221.685 | 8.7 | 25,074.784 | 41.7 | 19,790,242 | UT_StringPtrMap::pick |
| 5,094.737 | 8.5 | 5,094.737 | 8.5 | 24,375,407 | s_compare |
| 4,389.488 | 7.3 | 4,389.488 | 7.3 | 32,369,224 | UT_String::c_str |
| 3,639.704 | 6.0 | 3,639.704 | 6.0 | 19,791,049 | hashcode |
| 3,340.739 | 5.6 | 35,380.115 | 58.8 | 7,315,538 | s_evalProperty |
| 2,932.803 | 4.9 | 16,925.543 | 28.1 | 7,182,371 | _getStyle |
| 2,907.229 | 4.8 | 9,538.794 | 15.8 | 14,913,358 | PP_AttrProp::getAttribute |
| 2,696.374 | 4.5 | 14,817.629 | 24.6 | 11,865,709 | PP_AttrProp::getProperty |
| 1,689.265 | 2.8 | 13,255.296 | 22.0 | 4,544,025 | PD_Style::getProperty |
| 1,339.043 | 2.2 | 54,874.276 | 91.2 | 538 | FV_View::getCharFormat |
| 1,242.416 | 2.1 | 1,949.131 | 3.2 | 5,491,292 | pt_PieceTable::getAttrProp |
| 1,108.008 | 1.8 | 7,424.034 | 12.3 | 4,544,537 | PD_Document::getStyle |
| 200.968 | 0.3 | 1,179.501 | 2.0 | 546,310 | PD_Style::getAttribute |
| 161.815 | 0.3 | 358.196 | 0.6 | 396,601 | pt_PieceTable::getSpanAttrProp |
| 109.995 | 0.2 | 263.809 | 0.4 | 400,445 | fl_Layout::getAttrProp |
| 99.328 | 0.2 | 457.525 | 0.8 | 396,601 | PD_Document::getSpanAttrProp |
| 97.972 | 0.2 | 153.814 | 0.3 | 400,445 | PD_Document::getAttrProp |
| 97.135 | 0.2 | 554.659 | 0.9 | 396,601 | fl_Layout::getSpanAttrProp |
| 66.505 | 0.1 | 66.505 | 0.1 | 410,702 | fl_ContainerLayout::getNext |
| 3.544 | 0.0 | 55,338.696 | 92.0 | 513 | FV_View::notifyListeners |
| 3.537 | 0.0 | 55,111.713 | 91.6 | 770 | fl_DocListener::change |
| 2.313 | 0.0 | 55,088.852 | 91.5 | 512 | pt_PieceTable::_fmtChangeSpanWithNotify |
| 1.536 | 0.0 | 55,174.568 | 91.7 | 1 | pt_PieceTable::_realChangeSpanFmt |
| 1.229 | 0.0 | 55,113.456 | 91.6 | 770 | PD_Document::notifyListeners |
| 0.030 | 0.0 | 55,174.597 | 91.7 | 1 | pt_PieceTable::changeSpanFmt |
| 0.017 | 0.0 | 60,182.809 | 100.0 | 1 | s_doLangDlg |
| 0.006 | 0.0 | 55,728.910 | 92.6 | 1 | FV_View::setCharFormat |
| 0.002 | 0.0 | 55,174.599 | 91.7 | 1 | PD_Document::changeSpanFmt |
Code Investigation
When a user changes the language of a text selection, Abiword calls
FV_View::setCharFormat which calls PD_Document::changeSpanFmt
which calls the PieceTable code to do the actual work.
For each text fragment in the selection, the PieceTable code calls
pt_PieceTable::_fmtChangeSpanWithNotify which
does the actual attribute/property change and then tells the
Document to notify the document listeners of the format change.
One of the objects listening is the View for the Document. When the View sees a character format change, it updates its character format change state for the current text selection, i.e. the common attributes/properties for the entire text selection.
There's a comment in the relevant code (see Line 1331 of abi\src\text\fmt\xp\fv_View.cpp) which describes how time-consuming this operation can be:
/* The following brute-force solution works, but is atrociously expensive, so we should avoid using it whenever feasible. TODO: devise special case logic for (at minimum) char motion */
The character-format-change-state-update code does the following:
For the test case, the text selection covers the entire document, so each text fragment (essentially each line in the test document) causes the code to recalculate the character format since each text fragment's language is modified individually.
512 text fragments result in 512 calculations of the common character format.
That's where all the time is going.
Proposed Solution
Here's the result of profiling the new version of Abiword changing the language for the entire document:
| Module Statistics for abiword.exe | |||
| Time in module: | 6,784.927 millisecond | ||
| Percent of time in module: | 100.0% | ||
| Functions in module: | 12,160 | ||
| Hits in module: | 3,258,254 | ||
| Module function coverage: | 6.1% | ||
| Func Time (ms) | % | Func+Child Time (ms) | % | Hit Count | Function |
| 175.381 | 2.6 | 288.122 | 4.2 | 314,473 | UT_StringPtrMap::find_slot |
| 83.590 | 1.2 | 371.326 | 5.5 | 313,666 | UT_StringPtrMap::pick |
| 76.089 | 1.1 | 749.401 | 11.0 | 54,577 | PP_evalProperty |
| 72.216 | 1.1 | 72.216 | 1.1 | 312,329 | s_compare |
| 57.718 | 0.9 | 57.718 | 0.9 | 314,473 | hashcode |
| 55.646 | 0.8 | 55.646 | 0.8 | 407,484 | UT_String::c_str |
| 42.621 | 0.6 | 510.070 | 7.5 | 86,324 | s_evalProperty |
| 40.333 | 0.6 | 146.849 | 2.2 | 186,129 | PP_AttrProp::getAttribute |
| 37.189 | 0.5 | 202.959 | 3.0 | 157,244 | PP_AttrProp::getProperty |
| 34.582 | 0.5 | 237.784 | 3.5 | 84,230 | _getStyle |
| 25.379 | 0.4 | 171.663 | 2.5 | 66,822 | PD_Style::getProperty |
| 20.716 | 0.3 | 32.614 | 0.5 | 89,256 | pt_PieceTable::getAttrProp |
| 16.198 | 0.2 | 110.086 | 1.6 | 67,334 | PD_Document::getStyle |
| 8.732 | 0.1 | 340.233 | 5.0 | 27 | FV_View::getCharFormat |
| 1.455 | 0.0 | 228.187 | 3.4 | 512 | pt_PieceTable::_fmtChangeSpanWithNotify |
| 1.191 | 0.0 | 199.407 | 2.9 | 512 | fl_BlockLayout::doclistener_changeSpan |
| 1.059 | 0.0 | 250.640 | 3.7 | 770 | fl_DocListener::change |
| 0.707 | 0.0 | 294.159 | 4.3 | 1 | pt_PieceTable::_realChangeSpanFmt |
| 0.510 | 0.0 | 49.471 | 0.7 | 256 | fl_BlockLayout::doclistener_changeFmtMark |
| 0.492 | 0.0 | 251.463 | 3.7 | 770 | PD_Document::notifyListeners |
| 0.184 | 0.0 | 199.592 | 2.9 | 512 | fl_SectionLayout::bl_doclistener_changeSpan |
| 0.028 | 0.0 | 294.187 | 4.3 | 1 | pt_PieceTable::changeSpanFmt |
| 0.027 | 0.0 | 661.138 | 9.7 | 2 | FV_View::notifyListeners |
| 0.024 | 0.0 | 547.053 | 8.1 | 2 | AV_View::notifyListeners |
| 0.016 | 0.0 | 6,784.920 | 100.0 | 1 | s_doLangDlg |
| 0.010 | 0.0 | 226.253 | 3.3 | 25 | FV_View::getCharFormat |
| 0.008 | 0.0 | 985.751 | 14.5 | 1 | FV_View::setCharFormat |
| 0.003 | 0.0 | 305.189 | 4.5 | 1 | FV_View::_generalUpdate |
| 0.003 | 0.0 | 680.392 | 10.0 | 1 | PD_Document::changeSpanFmt |
| 0.003 | 0.0 | 0.003 | 0.0 | 1 | PD_Document::deferChanges |
| 0.003 | 0.0 | 386.198 | 5.7 | 1 | fl_DocListener::processDeferredChanges |
| 0.001 | 0.0 | 386.199 | 5.7 | 1 | PD_Document::processDeferredChanges |