[Bug: 21.5-b28] performance on aix, cpu hog, tprof, --without-error-checking --without-kkcc

Thomas Mittelstaedt T.Mittelstaedt at cadenas.de
Thu Mar 13 06:25:21 EDT 2008


Tried out the following:

./configure  '--without-error-checking' 
'--with-late-packages=/usr/local/lib/xemacs' '--without-kkcc'

and xemacs still starts using lots of cpu but then goes down to 4-11% 
range and stays there.
tprof can also generate micro-profiles of the individual functions. If 
somebody is interested I can send a tar
of these profiles.



Thomas Mittelstaedt schrieb:
> With xemacs configured without error-checking, I still have 30% + cpu 
> usage, but interactive usage is
> significantly better.
>
>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28) = 27,48
>
> Subroutine                                            %   
> Source             
> ==========                                         ====== 
> ======             
> .add_ichar_rune_1                                    1,47 
> redisplay.c        
> .extent_list_locate                                  1,22 
> extents.c          
> .re_match_2_internal                                 1,21 
> regex.c            
> .assq_no_quit                                        1,02 
> fns.c              
> .execute_optimized_program                           0,89 
> bytecode.c         
> .create_text_block                                   0,76 
> redisplay.c        
> .display_table_entry                                 0,75 
> glyphs.c           
> .kkcc_marking                                        0,73 
> gc.c               
> .BYTE_BUF_BYTE_ADDRESS_NO_VERIFY                     0,53 
> inline.c           
> .compare_runes                                       0,53 
> redisplay-output.c 
> .Fcons                                               0,42 
> alloc.c            
> .noseeum_cons                                        0,35 
> alloc.c            
> .kkcc_gc_stack_push_lisp_object_1                    0,35 
> gc.c               
> .recompute_funcall_allocation_flag                   0,33 
> alloc.c            
> .readchar                                            0,33 
> lread.c            
> .Fmemq                                               0,33 
> fns.c              
> .generate_fstring_runes                              0,32 
> redisplay.c        
> .extent_list_at                                      0,32 
> extents.c          
> .specifier_get_inst_list                             0,30 
> specifier.c        
> .free_cons                                           0,29 
> alloc.c            
> .kkcc_gc_stack_push_1                                0,28 
> gc.c               
> .find_symbol_value_1                                 0,28 
> symbols.c          
> .recompute_need_to_garbage_collect                   0,26 
> gc.c               
> .BYTE_BUF_BYTE_ADDRESS                               0,26 
> inline.c           
> .extent_ancestor                                     0,22 
> inline.c           
> .store_symval_forwarding                             0,22 
> symbols.c          
> .bytebpos_to_charbpos                                0,19 
> inline.c           
> .unbind_to_1                                         0,19 
> eval.c             
> .Fset                                                0,19 
> symbols.c          
> .Ffuncall                                            0,18 
> eval.c             
> .extent_fragment_update                              0,18 
> extents.c          
> .unbind_to_hairy                                     0,18 
> eval.c             
> .do_symval_forwarding                                0,18 
> symbols.c          
> .kkcc_bt_push                                        0,18 
> gc.c               
> .no_conversion_convert                               0,17 
> file-coding.c      
> .alloc_managed_lcrecord                              0,17 
> alloc.c            
> .offset_post_relocation                              0,17 
> regex.c            
> .free_managed_lcrecord                               0,17 
> alloc.c            
> .malloc                                              0,16 
> gmalloc.c          
> .lispdesc_indirect_count                             0,16 
> inline.c           
> .symbol_is_constant                                  0,15 
> symbols.c          
> .add_ichar_rune                                      0,15 
> redisplay.c        
> .compare_display_blocks                              0,15 
> redisplay-output.c 
> .reject_constant_symbols                             0,15 
> symbols.c          
> .specifier_instance_from_inst_list                   0,14 
> specifier.c        
> .specifier_instance_1                                0,14 
> specifier.c        
>
> Thomas Mittelstaedt schrieb:
>   
>> The previous output was from an image, which I had instrumented for 
>> gprof. Here is
>> the normal one:
>> User 
>> Process                                                              % 
>> =============                                                          
>> ======
>> /usr/local/bin/xemacs-21.5-b28.bak                                      
>> 48,33
>> /usr/local/bin/bash                                                      
>> 0,01
>>
>>   Profile: /usr/local/bin/xemacs-21.5-b28.bak
>>
>>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28.bak) = 48,33
>>
>> Subroutine                                            %   
>> Source             
>> ==========                                         ====== 
>> ======             
>> .error_check_cons                                    3,09 
>> inline.c           
>> .re_match_2_internal                                 2,36 
>> regex.c            
>> .extent_list_locate                                  2,23 
>> extents.c          
>> .error_check_lstream                                 2,16 
>> inline.c           
>> .Dynarr_verify_1                                     1,93 
>> inline.c           
>> .error_check_buffer                                  1,25 
>> inline.c           
>> .error_check_string                                  0,86 
>> inline.c           
>> .kkcc_marking                                        0,82 
>> gc.c               
>> .add_ichar_rune_1                                    0,77 
>> redisplay.c        
>> .deadbeef_memory                                     0,75 
>> alloc.c            
>> .free_cons                                           0,65 
>> alloc.c            
>> .readchar                                            0,64 
>> lread.c            
>> .assq_no_quit                                        0,63 
>> fns.c              
>> .Dynarr_verify_mod_1                                 0,62 
>> inline.c           
>> .error_check_image_instance                          0,55 
>> inline.c           
>> .execute_optimized_program                           0,55 
>> bytecode.c         
>> .error_check_symbol                                  0,54 
>> inline.c           
>> .noseeum_cons                                        0,53 
>> alloc.c            
>> .BYTE_BUF_BYTE_ADDRESS_NO_VERIFY                     0,52 
>> inline.c           
>> .offset_post_relocation                              0,46 
>> regex.c            
>> .extent_list_at                                      0,45 
>> extents.c          
>> .wrap_record_1                                       0,45 
>> inline.c           
>> .BYTE_BUF_BYTE_ADDRESS                               0,45 
>> inline.c           
>> .recompute_funcall_allocation_flag                   0,44 
>> alloc.c            
>> .mark_object_maybe_checking_free_1                   0,41 
>> gc.c               
>> .compare_runes                                       0,40 
>> redisplay-output.c 
>> .display_table_entry                                 0,40 
>> glyphs.c           
>> .kkcc_gc_stack_push_lisp_object_1                    0,40 
>> gc.c               
>> .alloc_managed_lcrecord                              0,37 
>> alloc.c            
>> .create_text_block                                   0,36 
>> redisplay.c        
>> .Fset                                                0,35 
>> symbols.c          
>> .error_check_vector                                  0,31 
>> inline.c           
>> .error_check_window                                  0,30 
>> inline.c           
>> .generate_fstring_runes                              0,30 
>> redisplay.c        
>> .no_conversion_convert                               0,30 
>> file-coding.c      
>> .kkcc_gc_stack_push_1                                0,29 
>> gc.c               
>> .Fcons                                               0,28 
>> alloc.c            
>> .recompute_need_to_garbage_collect                   0,27 
>> gc.c               
>> .re_search_2                                         0,26 
>> regex.c            
>> .specifier_get_inst_list                             0,26 
>> specifier.c        
>> .unbind_to_hairy                                     0,26 
>> eval.c             
>> .unbind_to_1                                         0,26 
>> eval.c             
>>
>>
>>
>> Thomas Mittelstaedt schrieb:
>>   
>>     
>>> I think, I found something interesting using tprof, which I ran like the 
>>> following:
>>>
>>> tprof -us -p xemacs-21.5-b28 -M /localbuild/xemacs-21.5.28-latest/src/ 
>>> -x /usr/local/bin/xemacs-21.5-b28
>>>
>>> User 
>>> Process                                                              % 
>>> =============                                                          
>>> ======
>>> /usr/local/bin/xemacs-21.5-b28                                          
>>> 60,65
>>>
>>>   Profile: /usr/local/bin/xemacs-21.5-b28
>>>
>>>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28) = 60,65
>>>
>>> Subroutine                                            %   
>>> Source             
>>> ==========                                         ====== 
>>> ======             
>>> .__mcount                                           22,40 
>>> /lib/libc/__mcount.c
>>> .qincrement                                          3,02 
>>> /lib/libc/__mcount.c
>>> .error_check_lstream                                 2,70 
>>> inline.c           
>>> .kkcc_marking                                        1,68 
>>> gc.c               
>>> .readchar                                            1,39 
>>> lread.c            
>>> .error_check_cons                                    1,21 
>>> inline.c           
>>> .extent_list_locate                                  1,10 
>>> extents.c          
>>> .error_check_string                                  1,00 
>>> inline.c           
>>> .no_conversion_convert                               0,81 
>>> file-coding.c      
>>> .__stack_pointer                                     0,79 _stk.s    
>>>
>>>
>>>
>>>
>>> Thomas Mittelstaedt schrieb:
>>>   
>>>     
>>>       
>>>> Using trace and curt, I produced another performance output, which is 
>>>> attached.
>>>> Maybe it helps.
>>>>
>>>> Thomas Mittelstaedt schrieb:
>>>>   
>>>>     
>>>>       
>>>>         
>>>>> Okay, I have attached the truss output with the start:
>>>>>
>>>>>
>>>>>
>>>>> Jerry James schrieb:
>>>>>   
>>>>>     
>>>>>       
>>>>>         
>>>>>           
>>>>>> Thomas,
>>>>>>
>>>>>> Thanks for going to the effort of collecting this profiling data.  It
>>>>>> gives us some interesting information.
>>>>>>
>>>>>> 2008/3/11 Thomas Mittelstaedt <T.Mittelstaedt at cadenas.de>:
>>>>>>   
>>>>>>     
>>>>>>       
>>>>>>         
>>>>>>           
>>>>>>             
>>>>>>>  ngranularity: Jedes gefundene Beispiel deckt 4 Byte ab. Zeit: 97,87 Sekunden
>>>>>>>
>>>>>>>   %   kumulativ   selbst              selbst     gesamt
>>>>>>>   Zeit  Sekunden   Sekunden    Aufrufe  ms/Aufruf  ms/Aufruf  Name
>>>>>>>   0,2       0,21     0,21                             .__is_wctype_sb [1]
>>>>>>>     
>>>>>>>       
>>>>>>>         
>>>>>>>           
>>>>>>>             
>>>>>>>               
>>>>>> [snip]
>>>>>>
>>>>>>   
>>>>>>     
>>>>>>       
>>>>>>         
>>>>>>           
>>>>>>             
>>>>>>>   0,0       1,00     0,01                             .xtCreate [48]
>>>>>>>     
>>>>>>>       
>>>>>>>         
>>>>>>>           
>>>>>>>             
>>>>>>>               
>>>>>> So the top 48 time hogs together consume a grand total of 1.00 seconds
>>>>>> out of 97.87 seconds of execution.  Your other profile was similar:
>>>>>> out of 56.55 seconds of execution, the top 43 collectively consumed
>>>>>> 0.73 seconds of execution.  This indicates that the performance
>>>>>> problems you are seeing are not occurring in the profiled (i.e.,
>>>>>> XEmacs) code.  We appear to either be triggering a performance bug in
>>>>>> some library, or using some library in a suboptimal way.  In either
>>>>>> case, this is as far as a profiler is going to take us.
>>>>>>
>>>>>> I'm still curious about the large number of kioctl calls in the truss
>>>>>> output you showed us earlier.  Do you know why that output did not
>>>>>> start with the launching of the XEmacs executable?  Can you get us
>>>>>> truss output that does start from launch?  I want to know what file
>>>>>> descriptor is the target of all that manipulation.  It may be a red
>>>>>> herring, but I think it's worth checking out.
>>>>>>
>>>>>> Thank you,
>>>>>>   
>>>>>>     
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> _______________________________________________
>>>>>> XEmacs-Beta mailing list
>>>>>> XEmacs-Beta at xemacs.org
>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>     
>>>>>>       
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> _______________________________________________
>>>>>> XEmacs-Beta mailing list
>>>>>> XEmacs-Beta at xemacs.org
>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>         
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> _______________________________________________
>>>>>> XEmacs-Beta mailing list
>>>>>> XEmacs-Beta at xemacs.org
>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>           
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> _______________________________________________
>>>>>> XEmacs-Beta mailing list
>>>>>> XEmacs-Beta at xemacs.org
>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>             
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> _______________________________________________
>>>>>> XEmacs-Beta mailing list
>>>>>> XEmacs-Beta at xemacs.org
>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta



More information about the XEmacs-Beta mailing list