[Bug: 21.5-b28] more detailed gprof profile, performance on aix, cpu hog, tprof, --without-error-checking --without-kkcc

Thomas Mittelstaedt T.Mittelstaedt at cadenas.de
Thu Mar 13 09:55:07 EDT 2008


I don't want to litter this list with profile data, but this one should 
be complete. The previous one
I posted, did not show the full listing.
Hope this helps. The configuration of xemacs was
--without-error-checking' '--without-kkcc'



Thomas Mittelstaedt schrieb:
> Tried out the following:
>
> ./configure  '--without-error-checking' 
> '--with-late-packages=/usr/local/lib/xemacs' '--without-kkcc'
>
> and xemacs still starts using lots of cpu but then goes down to 4-11% 
> range and stays there.
> tprof can also generate micro-profiles of the individual functions. If 
> somebody is interested I can send a tar
> of these profiles.
>
>
>
> Thomas Mittelstaedt schrieb:
>   
>> With xemacs configured without error-checking, I still have 30% + cpu 
>> usage, but interactive usage is
>> significantly better.
>>
>>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28) = 27,48
>>
>> Subroutine                                            %   
>> Source             
>> ==========                                         ====== 
>> ======             
>> .add_ichar_rune_1                                    1,47 
>> redisplay.c        
>> .extent_list_locate                                  1,22 
>> extents.c          
>> .re_match_2_internal                                 1,21 
>> regex.c            
>> .assq_no_quit                                        1,02 
>> fns.c              
>> .execute_optimized_program                           0,89 
>> bytecode.c         
>> .create_text_block                                   0,76 
>> redisplay.c        
>> .display_table_entry                                 0,75 
>> glyphs.c           
>> .kkcc_marking                                        0,73 
>> gc.c               
>> .BYTE_BUF_BYTE_ADDRESS_NO_VERIFY                     0,53 
>> inline.c           
>> .compare_runes                                       0,53 
>> redisplay-output.c 
>> .Fcons                                               0,42 
>> alloc.c            
>> .noseeum_cons                                        0,35 
>> alloc.c            
>> .kkcc_gc_stack_push_lisp_object_1                    0,35 
>> gc.c               
>> .recompute_funcall_allocation_flag                   0,33 
>> alloc.c            
>> .readchar                                            0,33 
>> lread.c            
>> .Fmemq                                               0,33 
>> fns.c              
>> .generate_fstring_runes                              0,32 
>> redisplay.c        
>> .extent_list_at                                      0,32 
>> extents.c          
>> .specifier_get_inst_list                             0,30 
>> specifier.c        
>> .free_cons                                           0,29 
>> alloc.c            
>> .kkcc_gc_stack_push_1                                0,28 
>> gc.c               
>> .find_symbol_value_1                                 0,28 
>> symbols.c          
>> .recompute_need_to_garbage_collect                   0,26 
>> gc.c               
>> .BYTE_BUF_BYTE_ADDRESS                               0,26 
>> inline.c           
>> .extent_ancestor                                     0,22 
>> inline.c           
>> .store_symval_forwarding                             0,22 
>> symbols.c          
>> .bytebpos_to_charbpos                                0,19 
>> inline.c           
>> .unbind_to_1                                         0,19 
>> eval.c             
>> .Fset                                                0,19 
>> symbols.c          
>> .Ffuncall                                            0,18 
>> eval.c             
>> .extent_fragment_update                              0,18 
>> extents.c          
>> .unbind_to_hairy                                     0,18 
>> eval.c             
>> .do_symval_forwarding                                0,18 
>> symbols.c          
>> .kkcc_bt_push                                        0,18 
>> gc.c               
>> .no_conversion_convert                               0,17 
>> file-coding.c      
>> .alloc_managed_lcrecord                              0,17 
>> alloc.c            
>> .offset_post_relocation                              0,17 
>> regex.c            
>> .free_managed_lcrecord                               0,17 
>> alloc.c            
>> .malloc                                              0,16 
>> gmalloc.c          
>> .lispdesc_indirect_count                             0,16 
>> inline.c           
>> .symbol_is_constant                                  0,15 
>> symbols.c          
>> .add_ichar_rune                                      0,15 
>> redisplay.c        
>> .compare_display_blocks                              0,15 
>> redisplay-output.c 
>> .reject_constant_symbols                             0,15 
>> symbols.c          
>> .specifier_instance_from_inst_list                   0,14 
>> specifier.c        
>> .specifier_instance_1                                0,14 
>> specifier.c        
>>
>> Thomas Mittelstaedt schrieb:
>>   
>>     
>>> The previous output was from an image, which I had instrumented for 
>>> gprof. Here is
>>> the normal one:
>>> User 
>>> Process                                                              % 
>>> =============                                                          
>>> ======
>>> /usr/local/bin/xemacs-21.5-b28.bak                                      
>>> 48,33
>>> /usr/local/bin/bash                                                      
>>> 0,01
>>>
>>>   Profile: /usr/local/bin/xemacs-21.5-b28.bak
>>>
>>>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28.bak) = 48,33
>>>
>>> Subroutine                                            %   
>>> Source             
>>> ==========                                         ====== 
>>> ======             
>>> .error_check_cons                                    3,09 
>>> inline.c           
>>> .re_match_2_internal                                 2,36 
>>> regex.c            
>>> .extent_list_locate                                  2,23 
>>> extents.c          
>>> .error_check_lstream                                 2,16 
>>> inline.c           
>>> .Dynarr_verify_1                                     1,93 
>>> inline.c           
>>> .error_check_buffer                                  1,25 
>>> inline.c           
>>> .error_check_string                                  0,86 
>>> inline.c           
>>> .kkcc_marking                                        0,82 
>>> gc.c               
>>> .add_ichar_rune_1                                    0,77 
>>> redisplay.c        
>>> .deadbeef_memory                                     0,75 
>>> alloc.c            
>>> .free_cons                                           0,65 
>>> alloc.c            
>>> .readchar                                            0,64 
>>> lread.c            
>>> .assq_no_quit                                        0,63 
>>> fns.c              
>>> .Dynarr_verify_mod_1                                 0,62 
>>> inline.c           
>>> .error_check_image_instance                          0,55 
>>> inline.c           
>>> .execute_optimized_program                           0,55 
>>> bytecode.c         
>>> .error_check_symbol                                  0,54 
>>> inline.c           
>>> .noseeum_cons                                        0,53 
>>> alloc.c            
>>> .BYTE_BUF_BYTE_ADDRESS_NO_VERIFY                     0,52 
>>> inline.c           
>>> .offset_post_relocation                              0,46 
>>> regex.c            
>>> .extent_list_at                                      0,45 
>>> extents.c          
>>> .wrap_record_1                                       0,45 
>>> inline.c           
>>> .BYTE_BUF_BYTE_ADDRESS                               0,45 
>>> inline.c           
>>> .recompute_funcall_allocation_flag                   0,44 
>>> alloc.c            
>>> .mark_object_maybe_checking_free_1                   0,41 
>>> gc.c               
>>> .compare_runes                                       0,40 
>>> redisplay-output.c 
>>> .display_table_entry                                 0,40 
>>> glyphs.c           
>>> .kkcc_gc_stack_push_lisp_object_1                    0,40 
>>> gc.c               
>>> .alloc_managed_lcrecord                              0,37 
>>> alloc.c            
>>> .create_text_block                                   0,36 
>>> redisplay.c        
>>> .Fset                                                0,35 
>>> symbols.c          
>>> .error_check_vector                                  0,31 
>>> inline.c           
>>> .error_check_window                                  0,30 
>>> inline.c           
>>> .generate_fstring_runes                              0,30 
>>> redisplay.c        
>>> .no_conversion_convert                               0,30 
>>> file-coding.c      
>>> .kkcc_gc_stack_push_1                                0,29 
>>> gc.c               
>>> .Fcons                                               0,28 
>>> alloc.c            
>>> .recompute_need_to_garbage_collect                   0,27 
>>> gc.c               
>>> .re_search_2                                         0,26 
>>> regex.c            
>>> .specifier_get_inst_list                             0,26 
>>> specifier.c        
>>> .unbind_to_hairy                                     0,26 
>>> eval.c             
>>> .unbind_to_1                                         0,26 
>>> eval.c             
>>>
>>>
>>>
>>> Thomas Mittelstaedt schrieb:
>>>   
>>>     
>>>       
>>>> I think, I found something interesting using tprof, which I ran like the 
>>>> following:
>>>>
>>>> tprof -us -p xemacs-21.5-b28 -M /localbuild/xemacs-21.5.28-latest/src/ 
>>>> -x /usr/local/bin/xemacs-21.5-b28
>>>>
>>>> User 
>>>> Process                                                              % 
>>>> =============                                                          
>>>> ======
>>>> /usr/local/bin/xemacs-21.5-b28                                          
>>>> 60,65
>>>>
>>>>   Profile: /usr/local/bin/xemacs-21.5-b28
>>>>
>>>>   Total % For All Processes (/usr/local/bin/xemacs-21.5-b28) = 60,65
>>>>
>>>> Subroutine                                            %   
>>>> Source             
>>>> ==========                                         ====== 
>>>> ======             
>>>> .__mcount                                           22,40 
>>>> /lib/libc/__mcount.c
>>>> .qincrement                                          3,02 
>>>> /lib/libc/__mcount.c
>>>> .error_check_lstream                                 2,70 
>>>> inline.c           
>>>> .kkcc_marking                                        1,68 
>>>> gc.c               
>>>> .readchar                                            1,39 
>>>> lread.c            
>>>> .error_check_cons                                    1,21 
>>>> inline.c           
>>>> .extent_list_locate                                  1,10 
>>>> extents.c          
>>>> .error_check_string                                  1,00 
>>>> inline.c           
>>>> .no_conversion_convert                               0,81 
>>>> file-coding.c      
>>>> .__stack_pointer                                     0,79 _stk.s    
>>>>
>>>>
>>>>
>>>>
>>>> Thomas Mittelstaedt schrieb:
>>>>   
>>>>     
>>>>       
>>>>         
>>>>> Using trace and curt, I produced another performance output, which is 
>>>>> attached.
>>>>> Maybe it helps.
>>>>>
>>>>> Thomas Mittelstaedt schrieb:
>>>>>   
>>>>>     
>>>>>       
>>>>>         
>>>>>           
>>>>>> Okay, I have attached the truss output with the start:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Jerry James schrieb:
>>>>>>   
>>>>>>     
>>>>>>       
>>>>>>         
>>>>>>           
>>>>>>             
>>>>>>> Thomas,
>>>>>>>
>>>>>>> Thanks for going to the effort of collecting this profiling data.  It
>>>>>>> gives us some interesting information.
>>>>>>>
>>>>>>> 2008/3/11 Thomas Mittelstaedt <T.Mittelstaedt at cadenas.de>:
>>>>>>>   
>>>>>>>     
>>>>>>>       
>>>>>>>         
>>>>>>>           
>>>>>>>             
>>>>>>>               
>>>>>>>>  ngranularity: Jedes gefundene Beispiel deckt 4 Byte ab. Zeit: 97,87 Sekunden
>>>>>>>>
>>>>>>>>   %   kumulativ   selbst              selbst     gesamt
>>>>>>>>   Zeit  Sekunden   Sekunden    Aufrufe  ms/Aufruf  ms/Aufruf  Name
>>>>>>>>   0,2       0,21     0,21                             .__is_wctype_sb [1]
>>>>>>>>     
>>>>>>>>       
>>>>>>>>         
>>>>>>>>           
>>>>>>>>             
>>>>>>>>               
>>>>>>>>                 
>>>>>>> [snip]
>>>>>>>
>>>>>>>   
>>>>>>>     
>>>>>>>       
>>>>>>>         
>>>>>>>           
>>>>>>>             
>>>>>>>               
>>>>>>>>   0,0       1,00     0,01                             .xtCreate [48]
>>>>>>>>     
>>>>>>>>       
>>>>>>>>         
>>>>>>>>           
>>>>>>>>             
>>>>>>>>               
>>>>>>>>                 
>>>>>>> So the top 48 time hogs together consume a grand total of 1.00 seconds
>>>>>>> out of 97.87 seconds of execution.  Your other profile was similar:
>>>>>>> out of 56.55 seconds of execution, the top 43 collectively consumed
>>>>>>> 0.73 seconds of execution.  This indicates that the performance
>>>>>>> problems you are seeing are not occurring in the profiled (i.e.,
>>>>>>> XEmacs) code.  We appear to either be triggering a performance bug in
>>>>>>> some library, or using some library in a suboptimal way.  In either
>>>>>>> case, this is as far as a profiler is going to take us.
>>>>>>>
>>>>>>> I'm still curious about the large number of kioctl calls in the truss
>>>>>>> output you showed us earlier.  Do you know why that output did not
>>>>>>> start with the launching of the XEmacs executable?  Can you get us
>>>>>>> truss output that does start from launch?  I want to know what file
>>>>>>> descriptor is the target of all that manipulation.  It may be a red
>>>>>>> herring, but I think it's worth checking out.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>   
>>>>>>>     
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> XEmacs-Beta mailing list
>>>>>>> XEmacs-Beta at xemacs.org
>>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>>     
>>>>>>>       
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> XEmacs-Beta mailing list
>>>>>>> XEmacs-Beta at xemacs.org
>>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>>         
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> XEmacs-Beta mailing list
>>>>>>> XEmacs-Beta at xemacs.org
>>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>>           
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> XEmacs-Beta mailing list
>>>>>>> XEmacs-Beta at xemacs.org
>>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>>             
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> XEmacs-Beta mailing list
>>>>>>> XEmacs-Beta at xemacs.org
>>>>>>> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>>>>>>>               
>
> _______________________________________________
> XEmacs-Beta mailing list
> XEmacs-Beta at xemacs.org
> http://calypso.tux.org/cgi-bin/mailman/listinfo/xemacs-beta
>   
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gprof.out.gz
Type: application/x-gzip
Size: 174446 bytes
Desc: not available
Url : http://lists.xemacs.org/pipermail/xemacs-beta/attachments/20080313/c0f3b974/gprof.out-0001.bin


More information about the XEmacs-Beta mailing list