Makefile dependency generation for elisp files?

Thu Apr 24 15:29:23 EDT 2008

On Wed, Apr 23, 2008 at 6:53 PM, Stephen J. Turnbull <stephen at xemacs.org>
wrote:

> Robert Widhopf-Fenk writes:
>
>  > GCC allows for the generation of dependencies for Makefiles
>  > by -MD.
>  >
>  > I wonder if there is something similar for Emacs, but I did
>  > not find anything.
>
> No.  elisp is designed to make this difficult and unreliable.
>
> It is possible to construct a graph of requires, but a require that is
> not listed in the package Makefile REQUIRES variable will also be
> reported as a fatal error when attempting to build.
>
> However, Lisp code that calls a function that is not defined at load
> time or compile time is legal by design, so there is no way (except to
> keep a database of all functions ever defined anywhere) to determine
> whether a function that is called is a dependency, and if so, it is
> not possible to determine where it was defined.
>

This goes way beyond what Robert wants, but I've thought about making a
database of all functions for static analysis purposes.  Lisp is a dynamic
system, which makes static analysis difficult and unreliable, as you note.
However, the set of Lisp files distributed with XEmacs plus the packages at
any given point in time comprise a static set that is subject to analysis.
I would like to be able to make certain claims about that set, then append
the caveat, "and what you, the user, do to this system by taking advantage
of Lisp's dynamic nature may render the analysis moot."

Here's the way I would like to construct an analysis engine for Elisp.  The
bottom layer is a database of all functions defined in C, with type
information on the parameters and return values [1] [2].  The next layer up
is an Emacs Lisp reader.  There is one in XEmacs, but XEmacs is not
optimized for this kind of analysis.  It would be slow.  I have looked at
using a Common Lisp engine for the analysis, but Mule-encoded files make
that difficult.  One possibility is to have XEmacs convert all Mule-encoded
files to something else (probably some form of Unicode) off-line, then feed
the converted files to the analysis engine.

Next is the definition database.  Essentially, you read in every single Lisp
file distributed with XEmacs and the packages, and make a list of all
definitions and the files in which they occur.  This is complicated by
conditional definitions (such as defun-when-void) and multiple unconditional
definitions of the same name, which can occur in the packages.  I haven't
thought of a good way of dealing with that problem yet.  Assuming I find a
magic wand to solve it, then next we generate a dependency graph.  Actually,
we need two dependency graphs: a build-time dependency graph and a run-time
dependency graph.  Macros, for example, induce a build-time dependency on
the macro definition, and a run-time dependency on functions and variables
used in the body of the macro.  Some of the dependencies are bogus, because
they stem from GNU Emacs-only code.  So we have to do some constant
folding.  At a minimum, we have to know that (featurep 'xemacs) and
running-xemacs are both the constant t, and we have to be prepared to
perform a string-match on emacs-version.  We then have to be able to
collapse if, when, unless, etc. forms to eliminate code that XEmacs will
never execute.  Note that these two dependency graphs both represent
definitional dependencies, not (necessarily) functional dependencies.

Then I'd like to start processing code for each function to determine type
information.  By making nil be a member of a distinct type, we can do
nil-tracking to see where we can get nils [3] into code that cannot handle
them.  We can also see whether we can provoke type-related throws that are
not caught by anything.

And I don't know what is after that.  Some kind of deeper analysis that I
haven't stopped to dream up yet, because I've already got my head way up in
the clouds with just this much.

Is anybody interested in working on something like this?  If so, I'll bring
my attempts at designing function type representations out into public
instead of trying to suffer through it in silence.

Also, I'm probably going to continue devoting most of my free time over the
next few months to a C analysis engine I'm working on (anybody see a theme
here?).  I don't anticipate having time to seriously pursue this until this
fall ... when I was planning to revive work on voice recognition support in
XEmacs.  I think both projects are fascinating, so I'm willing to have my
priorities swayed by the community.  Plus, I haven't sprung for a good
microphone yet. :-)

Footnotes:
[1] In fact, I started working on this database some months ago, but haven't
gotten very far yet, because I keep finding that my type representation
isn't rich enough.  That means I have to enrich it and start over.  I've
done that 3 times already.  The sticking point has repeatedly been cases
where there are dependencies between the types.  For example, the return
type of #'identity is the type of its argument.  I'm trying to use a type
variable approach, much like Haskell and ML.  There is also the question of
how to maintain the database as the C code is maintained and extended, to
which I do not have a satisfactory answer.
[2] I also need to decide whether to attempt "throw" tracking, much like
exception declarations in Java.  That is, if we track what can be thrown
from a given function, then a later analysis step can decide that a throw is
"bad" (i.e., indicates a program error) only if it reaches top-level without
being caught.  This is uncomputable in the general case.  What does funcall
throw?  Whatever the function it invokes throws.  On the other hand, we are
trying to do the analysis for a closed set of code, so we should be able to
figure this out in many cases.  Where we can't figure it out (e.g., we are
funcalling a user-supplied function), we assume that ANYTHING can be thrown
(and that the return type is "anything").  This is right, because such code
*should* be prepared to handle arbitrary behavior from a user-supplied
function.
[3] Not Oleson.  He's a character from "Little House on the Prairie."
Besides, he spelled his name "Nels", not "Nils" so why did you even think of
him?
-- 
Jerry James
http://loganjerry.googlepages.com/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.xemacs.org/pipermail/xemacs-beta/attachments/20080424/5c7a78dc/attachment.htm