Groovy has grown into a fairly complete package: beyond the language itself, it ships several useful APIs and wrappers that simplify working with Ant, Swing, XML, JDBC, the command line, and more. The main groovy-all JAR now exceeds 4MB, which rules out, for example, usage in an applet, because the JAR takes too long to load. Since nobody needs everything in Groovy, it is time to rework the Groovy source organization, deliverables, project build, and more. However, this is a "big bang" kind of change that requires a good deal of discipline to carry out properly, as several key steps will have to be taken.
Having a modular system means that:
A good approach will be to follow some baby steps:
The major options for modular build systems are:
A consensus is emerging that Gradle should be our new build infrastructure. A Gradle replacement for the current Ant build will be created over the next few weeks. This can then be evolved into a multi-project build once the module structure is decided.
We'll also need to ensure that our toolset will properly support a Gradle build infrastructure:
To help with external contributions, and also to ease the work of source code reorganization, it would be worthwhile to move from Subversion to a distributed version control system (DVCS). Git is the DVCS chosen at Codehaus, so we'll migrate to Git as part of the process.
Here again, we need to ensure that tool support is okay with Git:
The current SVN structure does not really follow standards. For example, groovy-core, a separately released component (at the moment; we hope to change this), is located at: http://svn.codehaus.org/groovy/trunk/groovy/groovy-core/
This should really be something like:
The modules listed above would be listed under: http://svn.codehaus.org/groovy/core/trunk/<module>
And they would be released together. Completely optional projects, like the native support, IDE support, etc., should be located at: http://svn.codehaus.org/groovy/<project>/trunk
Such a structure will ensure the core modules follow the same lifecycle as groovy-core, whereas external modules can still evolve independently and don't need to be released alongside Groovy releases.
There should perhaps be three levels of module:
Core modules are those needed for the core set of functionality to work. Whether these are shipped as a single jar or a set of jars is up for debate. Everything in this category should be held in the central Git master repository.
Non-core, centrally supported modules should be in separate Git repositories, not in the Groovy master. They should, though, be managed from Codehaus in tight-knit collaboration with the master core repository.
Other modules are held by whoever maintains them, wherever they choose to host them. These might be managed using Bazaar, Mercurial, or Git. There is no need to impose Git as a tool unless a module is to migrate into one of the two centrally managed categories above, in which case it must move to Git.
There should be a central index of core modules, non-core centrally managed modules and of any other modules people care to register with the index.
Since we've made Groovy's deliverables proper OSGi bundles, we need to pursue our efforts and make sure the core modules are also valid OSGi bundles, declaring both the dependencies between core modules and the external dependencies.
Some modules (swing, sql) also provide their own DefaultGroovyMethods. We'll need to figure out a way for these modules to register their own DGM methods, such as a META-INF/services technique.
An old JIRA issue we may consider: http://jira.codehaus.org/browse/GROOVY-2422
The idea is to provide a "capabilities" mechanism to Groovy, to know which features are supported in the current Groovy version.
Inspired by this issue, we could at least provide some utility class (for instance in GroovySystem) where we could discover the existing modules available on the classpath.
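As a rough illustration of such a discovery utility, here is a minimal Java sketch. Everything in it is an assumption made for the example: the ModuleRegistry class, the META-INF/groovy-module.properties descriptor and its moduleName key are hypothetical and are not part of Groovy or GroovySystem. The idea is simply that each module JAR ships a small descriptor, and the utility collects every descriptor visible to a class loader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not an existing Groovy API.
final class ModuleRegistry {
    // Assumed descriptor location: each module JAR would ship one such file.
    static final String DESCRIPTOR = "META-INF/groovy-module.properties";

    private ModuleRegistry() {}

    // Returns the sorted names of all modules whose descriptor is visible to the loader.
    static Set<String> availableModules(ClassLoader loader) throws IOException {
        Set<String> names = new TreeSet<>();
        // getResources returns one URL per JAR or directory containing the descriptor.
        for (URL url : Collections.list(loader.getResources(DESCRIPTOR))) {
            Properties props = new Properties();
            try (InputStream in = url.openStream()) {
                props.load(in);
            }
            String name = props.getProperty("moduleName"); // assumed key
            if (name != null && !name.trim().isEmpty()) {
                names.add(name.trim());
            }
        }
        return names;
    }

    // Convenience check for a single module name.
    static boolean isAvailable(ClassLoader loader, String module) throws IOException {
        return availableModules(loader).contains(module);
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("Modules found: " + availableModules(loader));
    }
}
```

A script could then call something like ModuleRegistry.isAvailable(loader, "sql") before relying on SQL support, which is close in spirit to the "capabilities" idea from the JIRA issue.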
Here's a module structure we could follow. We shouldn't multiply the number of modules needlessly, but rather focus on core functionality, and ensure a pure core will be small enough to be easily embeddable and downloadable (for applets, mobile devices such as Android, etc.)
Certain classes would naturally belong to certain modules, but are sometimes used in core, etc.
A good example of this is GroovyTestCase and GroovyShellTestCase. These two classes would naturally go into the test module, but then the core module would depend on test, while test already depends on core. So far, in the following approach, I've kept these base classes (not very heavyweight anyway) in core, but we'll have to figure out a good way to migrate them into their respective modules.
Another example is the java2groovy tool (rarely used, and possibly to be discarded at some point). It belongs in its own module, under some tools meta-module, but the catch is that it also contributes some batch scripts so one can run the tool from the command line easily. For the normal distribution, anyway, we can embed those scripts, so that shouldn't be critical.
An important thing to consider is DefaultGroovyMethods. There is already some work done in that area to split out the methods related to SQL/JDBC and to Swing. These additional DGM methods will naturally go in their respective modules (i.e. sql and swing), but that means we need to provide a mechanism for discovering and registering such methods, for example with some META-INF/services discovery mechanism.
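One way such a META-INF/services mechanism could look is sketched below, using the standard java.util.ServiceLoader from the JDK. The ExtensionMethodProvider interface, the ExtensionRegistry class, and the SqlMethods and SqlExtensionProvider stand-ins are all hypothetical names invented for this illustration. How Groovy would actually wire the discovered classes into its method dispatch is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical SPI, not an existing Groovy interface. A module such as sql or
// swing would implement it and name the implementation class in a text file at
// META-INF/services/<fully qualified name of this interface> inside its JAR.
interface ExtensionMethodProvider {
    String moduleName();

    // Classes whose static methods should be registered as DGM-style methods.
    List<Class<?>> extensionClasses();
}

// Placeholder for the class holding the sql module's extra DGM methods.
final class SqlMethods {
    private SqlMethods() {}
}

// Stand-in for what the sql module might ship. In a real module this class
// would be public and live in its own source file.
final class SqlExtensionProvider implements ExtensionMethodProvider {
    public String moduleName() {
        return "sql";
    }

    public List<Class<?>> extensionClasses() {
        return Collections.<Class<?>>singletonList(SqlMethods.class);
    }
}

// Collects extension classes, either registered by hand or found via ServiceLoader.
final class ExtensionRegistry {
    private final List<Class<?>> extensionClasses = new ArrayList<>();

    void register(ExtensionMethodProvider provider) {
        extensionClasses.addAll(provider.extensionClasses());
    }

    // Finds every provider declared under META-INF/services and visible to the loader.
    static ExtensionRegistry discover(ClassLoader loader) {
        ExtensionRegistry registry = new ExtensionRegistry();
        for (ExtensionMethodProvider provider
                : ServiceLoader.load(ExtensionMethodProvider.class, loader)) {
            registry.register(provider);
        }
        return registry;
    }

    List<Class<?>> extensionClasses() {
        return Collections.unmodifiableList(extensionClasses);
    }

    public static void main(String[] args) {
        ExtensionRegistry registry = discover(ExtensionRegistry.class.getClassLoader());
        System.out.println("Discovered extension classes: " + registry.extensionClasses());
    }
}
```

With this approach the core never references the sql or swing modules directly: dropping a module JAR on the classpath is enough for its methods to be found, which keeps the dependency pointing from module to core only.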
The modules proposed here are fairly coarse-grained, in order to keep things simple and to group things by broad functionality. But some things may be split further. For instance, we could put CliBuilder into its own module, and the same for AntBuilder (although those two are really just one class each!). It may be worth doing so, because these classes need additional dependencies (commons-cli and ant, respectively). Speaking of splitting those builders into their respective modules, we could even regroup such builders into some meta-builder module.
The groupings below are done per functionality, so even if some utility class like groovy.util.GroovyMBean is in groovy.util, I moved it to the jmx module. Ultimately, it'd certainly be nicer to move such classes into their proper packages, following a deprecation strategy.
I've created a little Groovy script to list / group all the Java and Groovy classes per package. Afterwards, I've moved the packages and some individual classes to form modules according to their respective feature set.
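The original Groovy script is not reproduced here. As a stand-in, the following Java sketch shows one way such a listing could be produced. It is an assumption-laden approximation: the PackageLister name is made up, and the package is inferred from the directory layout under the source root rather than from each file's package declaration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper: groups .java and .groovy sources by package.
final class PackageLister {
    private PackageLister() {}

    // Maps each package name to the sorted names of the classes found in it.
    // Assumes the directory structure mirrors the package structure.
    static Map<String, Set<String>> classesByPackage(Path sourceRoot) throws IOException {
        Map<String, Set<String>> result = new TreeMap<>();
        try (Stream<Path> files = Files.walk(sourceRoot)) {
            files.filter(Files::isRegularFile)
                 .filter(PackageLister::isSourceFile)
                 .forEach(file -> {
                     Path relative = sourceRoot.relativize(file);
                     Path parent = relative.getParent();
                     String pkg = parent == null
                             ? "(default)"
                             : parent.toString().replace(File.separatorChar, '.');
                     String fileName = relative.getFileName().toString();
                     String className = fileName.substring(0, fileName.lastIndexOf('.'));
                     result.computeIfAbsent(pkg, k -> new TreeSet<>()).add(className);
                 });
        }
        return result;
    }

    private static boolean isSourceFile(Path file) {
        String name = file.getFileName().toString();
        return name.endsWith(".java") || name.endsWith(".groovy");
    }

    public static void main(String[] args) throws IOException {
        // The source root is passed as the first argument; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        classesByPackage(root).forEach((pkg, classes) ->
                System.out.println(pkg + ": " + classes));
    }
}
```

The resulting map gives a per-package inventory that can then be regrouped by hand into candidate modules, as described above.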