Date: Mon Jan 29 2004

Authour: Johnny Lee

Email: typo_pl@hotmail.com

Summary


I took a look at Abiword bug 5880 and my profiled run of Abiword on this test case produced similar results to the profiled output attached to the bug report.

After much time trying to understand what the code was doing, I devised a solution which reduced the runtime for this operation to <2% of the original runtime!

Diffs can be found here: http://www.geocities.com/typopl/bug5880diff.txt

 

This table shows the time to change the language of all the text in a document from English to Greek using Abiword and using Abiword with my proposed patch.

The time is the average of three runs. The document I used is the document attached to the bug report for bug 5880.

AppOperation time (ms)
Abiword 2.1.08500
... + Patch 1120

Links:

  1. Bugzilla bug entry: Bugzilla Bug 5880 - very slow in setting language on entire document
  2. Start of discussion thread: Are we ready for 2.0.3?
  3. Abiword Piece Table Description: Piece Table Description
  4. Email thread describing my patch: [PATCH] Fix bug 5880 - very slow in setting language on entire document

 

Setup


Windows 2000 SP4. MS VC6 with SP5.

Abiword source code, abiword-2.1.0.tar.gz

 

Problem


Bug 5880 describes an operation in Abiword that takes too long - changing the language of all the text in a document.

The following table shows a profile of Abiword changing the language for a large text selection.

One call to FV_View::setCharFormat proceeds to 512 calls to pt_PieceTable::_fmtChangeSpanWithNotify resulting in a similar number of calls to FV_View::getCharFormat which causes millions of calls to PP_AttrProp::getProperty and PP_AttrProp::getAttribute.

Module Statistics for abiword.exe
  Time in module:60,182.815 millisecond
  Percent of time in module:100.0%
  Functions in module:12,160
  Hits in module:192,006,177
  Module function coverage:6.0%

Func Time (ms)%Func+Child Time (ms)%Hit CountFunction
11,825.01819.619,853.47933.019,791,049UT_StringPtrMap::find_slot
5,715.4439.552,835.40687.84,131,906PP_evalProperty
5,221.6858.725,074.78441.719,790,242UT_StringPtrMap::pick
5,094.7378.55,094.7378.524,375,407s_compare
4,389.4887.34,389.4887.332,369,224UT_String::c_str
3,639.7046.03,639.7046.019,791,049hashcode
3,340.7395.635,380.11558.87,315,538s_evalProperty
2,932.8034.916,925.54328.17,182,371_getStyle
2,907.2294.89,538.79415.814,913,358PP_AttrProp::getAttribute
2,696.3744.514,817.62924.611,865,709PP_AttrProp::getProperty
1,689.2652.813,255.29622.04,544,025PD_Style::getProperty
1,339.0432.254,874.27691.2538FV_View::getCharFormat
1,242.4162.11,949.1313.25,491,292pt_PieceTable::getAttrProp
1,108.0081.87,424.03412.34,544,537PD_Document::getStyle
200.9680.31,179.5012.0546,310PD_Style::getAttribute
161.8150.3358.1960.6396,601pt_PieceTable::getSpanAttrProp
109.9950.2263.8090.4400,445fl_Layout::getAttrProp
99.3280.2457.5250.8396,601PD_Document::getSpanAttrProp
97.9720.2153.8140.3400,445PD_Document::getAttrProp
97.1350.2554.6590.9396,601fl_Layout::getSpanAttrProp
66.5050.166.5050.1410,702fl_ContainerLayout::getNext
3.5440.055,338.69692.0513FV_View::notifyListeners
3.5370.055,111.71391.6770fl_DocListener::change
2.3130.055,088.85291.5512pt_PieceTable::_fmtChangeSpanWithNotify
1.5360.055,174.56891.71pt_PieceTable::_realChangeSpanFmt
1.2290.055,113.45691.6770PD_Document::notifyListeners
0.0300.055,174.59791.71pt_PieceTable::changeSpanFmt
0.0170.060,182.809100.01s_doLangDlg
0.0060.055,728.91092.61FV_View::setCharFormat
0.0020.055,174.59991.71PD_Document::changeSpanFmt

 

Code Investigation


When a user changes the language of a text selection, Abiword calls FV_View::setCharFormat which calls PD_Document::changeSpanFmt which calls the PieceTable code to do the actual work.

For each text fragment in the selection, the PieceTable code calls pt_PieceTable::_fmtChangeSpanWithNotify which does the actual attribute/property change and then tells the Document to notify the document listeners of the format change.

One of the objects listening is the View for the Document. When the View sees a character format change, it updates its character format change state for the current text selection, i.e. the common attributes/properties for the entire text selection.

There's a comment in the relevant code (see Line 1331 of abi\src\text\fmt\xp\fv_View.cpp) which describes how time-consuming this operation can be:

		/*
		  The following brute-force solution works, but is atrociously
		  expensive, so we should avoid using it whenever feasible.

		  TODO: devise special case logic for (at minimum) char motion
		*/

The character-format-change-state-update code does the following:

  1. Put in a list all the attributes and properties for the first text fragment in the selection
  2. Iterate over the remaining text fragments in the text selection, comparing the attributes and properties of the first fragment with the attributes and properties of the other text fragments in the selection. Any attribute or property which have different values between text fragments is removed from the first list. When finished all that remains in the list are attributes and properties that exist in all text fragments in the selection and have the same value in all text fragments.
  3. Compare the list of common attributes and properties with the View's last list of character format changes. If the lists are different, then the new list becomes the current list of character format changes.

For the test case, the text selection covers the entire document, so each text fragment (essentially each line in the test document) causes the code to recalculate the character format since each text fragment's language is modified individually.

512 text fragments result in 512 calculations of the common character format.

That's where all the time is going.

 

Proposed Solution


Here's the result of profiling the new version of Abiword changing the language for the entire document:

Module Statistics for abiword.exe
  Time in module:6,784.927 millisecond
  Percent of time in module:100.0%
  Functions in module:12,160
  Hits in module:3,258,254
  Module function coverage:6.1%

Func Time (ms)%Func+Child Time (ms)%Hit CountFunction
175.3812.6288.1224.2314,473UT_StringPtrMap::find_slot
83.5901.2371.3265.5313,666UT_StringPtrMap::pick
76.0891.1749.40111.054,577PP_evalProperty
72.2161.172.2161.1312,329s_compare
57.7180.957.7180.9314,473hashcode
55.6460.855.6460.8407,484UT_String::c_str
42.6210.6510.0707.586,324s_evalProperty
40.3330.6146.8492.2186,129PP_AttrProp::getAttribute
37.1890.5202.9593.0157,244PP_AttrProp::getProperty
34.5820.5237.7843.584,230_getStyle
25.3790.4171.6632.566,822PD_Style::getProperty
20.7160.332.6140.589,256pt_PieceTable::getAttrProp
16.1980.2110.0861.667,334PD_Document::getStyle
8.7320.1340.2335.027FV_View::getCharFormat
1.4550.0228.1873.4512pt_PieceTable::_fmtChangeSpanWithNotify
1.1910.0199.4072.9512fl_BlockLayout::doclistener_changeSpan
1.0590.0250.6403.7770fl_DocListener::change
0.7070.0294.1594.31pt_PieceTable::_realChangeSpanFmt
0.5100.049.4710.7256fl_BlockLayout::doclistener_changeFmtMark
0.4920.0251.4633.7770PD_Document::notifyListeners
0.1840.0199.5922.9512fl_SectionLayout::bl_doclistener_changeSpan
0.0280.0294.1874.31pt_PieceTable::changeSpanFmt
0.0270.0661.1389.72FV_View::notifyListeners
0.0240.0547.0538.12AV_View::notifyListeners
0.0160.06,784.920100.01s_doLangDlg
0.0100.0226.2533.325FV_View::getCharFormat
0.0080.0985.75114.51FV_View::setCharFormat
0.0030.0305.1894.51FV_View::_generalUpdate
0.0030.0680.39210.01PD_Document::changeSpanFmt
0.0030.00.0030.01PD_Document::deferChanges
0.0030.0386.1985.71fl_DocListener::processDeferredChanges
0.0010.0386.1995.71PD_Document::processDeferredChanges

 

1