Skip to content

Refactoring Algorithms

schauder edited this page Jan 26, 2013 · 4 revisions

One of the ideas for Degraph is to create automatic recommendations for refactorings on package level. This page is for gathering ideas around this topic.

Split of New Package

Problem

The number of packages often grows over time. At some point one wants to move part of the class in the package into a new package. In a large package it might be not obvious how to do that without creating unwanted dependencies. Ideally the new package should be close to half the size of the original package.

Algorithm Description Basic Version

Classes are nodes in a Digraph G.

A package assignment P is a set of disjunct sets of nodes {p1 ... pi} with every node being member of exactly one Package.

There is a second Digraph Gp where each package is a node and an edge (p1 -> p2) iff there exists classes c1 in p1 and c2 in p2 with an edge (c1 -> c2) in G

Assume Gp is free of cycles

If p is one of the packages Split(P,pi) is a new package assignment P': {p1 ... pi', pj} with pi' and pj being a strict subsets of pi and the union of pi' and pj = pi the resulting Digraph Gp' must not have any cycles.

The algorithm should find a Split(P, pi), or all of them or the largest, where the size of a Split is defined as the minimum of the number of classes in pi' and pj

Variation: With Cycles

Gp might not be free of cycles Splits should then not introduce new cycles.

Variation: Multiple Slicing

Consider the Algorithm when there are multiple Package Assignments and the restrictions to the Split should consider all Package Assignments simultaneously.

Questions

Is there a meaningful, fast way for counting cycles? Is there only one, or are there multiple ways?

Is there a technical term of what I call Slicing or Package Assignment.

How to find suitable Splits

Quantifying the Quality of Package Assignment

Problem

Classes are split into packages. But classes also form a Digraph. Just from looking at the Digraph one might come come up with a package structure. The extend to which the actual package assignment and the one generated from the Digraph match might be considered a quality metric. A trivial example for deducing the package structure would be in a graph that contains two complete subgraphs (where every node of a subgraph is connected with each other node of the same subgraph). But between the two subgraphs there is just a single edge. The two subgraphs each form a natural package.

Properties expected from the algorithm

The algorithm should probably have something like a preferred package size. So it avoids overly large packages as well as tiny packages with a single class (or even the pathological case of zero classes in a package)

Sets of classes with lots of edges between them should tend to end up in the same package.

Cycles between packages should rare, i.e. they should only be allowed when packages get huge otherwise.

Clone this wiki locally