ITK/Backward Compatibility Open Discussion

From KitwarePublic
Jump to navigationJump to search

The topic of backward compatibility has generated animated discussions among users and developers.

We gather here multiple points of view from some of the influential users and developers.

The content of this page is not intended to be conclusive on any of the proposed topics. It simple intends to gather elements of discussion that will serve as food-for-thought.

Main Discussion Topics

Discussion Topics

  1. Deprecation of classes / method. When to deprecate, and how to deprecate.
  2. Drawing the line where fixing a bug will result in backward compatibility breaks.
  3. Cathedral/Bazar: How much burden to put on new contributions to the toolkit.
    1. Is the Insight Journal + Code Review directory a process that is too burdensome.
  4. When can we require additional effort from users to update their code for using new versions of ITK ?
    1. Never
    2. Once a year
    3. Once every five years
  5. How far back in time should we maintain backward compatibility (relates to the previous topic)
    1. Today's version should be compatible with ITK version from N years ago
      1. 1 year ?
      2. 2 years ?
      3. 5 years ?

Some general issues:

  • Who are the customers for ITK?
  • What do they want from ITK?
  • How much of what the customers want is the development community actually able to deliver?
  • Once there's agreement on what ITK is actually committing to deliver, the backward compatibility policy can be implemented to clearly communicate how various components of the code are evolving.

Position Statement 1 (Luis Ibanez)

Topic A

Topic B

Topic C

Position Statement 2 (Bill Lorensen)

These comments are addressing source code backward compatibility.

One of the major criticisms of open-source software is that new revisions are not compatible with old revisions. Breaking compatibility impedes the acceptance and utility of open-source software. On the other hand, strict backward compatibility polices can impede innovation in software. The tension between these two viewpoints is not easily resolved.

As projects mature and the customer base grows, backward compatibility becomes more important. Commercial hardware and software products call this customer base, the installed base. Commercial products usually have a known customer base consisting of those who have purchased or licensed the software. Also, commercial systems seldom expose internal API's. Open source projects rarely know the identities of their customers. And, since the source is open, customers have access to all public and protected classes, methods and data in the code. For open source software, it is almost impossible to determine how the customer base is using the software.

When a project hits a certain point in its life cycle, the unpleasant issue of backward compatibility begins to rear its ugly head. All of a sudden the changes introduced in a new release of the software have a dark side to them; they hold hidden possibilities that will break something one of your users depends on. This is true in both open and closed source projects, but in the open source world it seems that the community has spent less time worrying about it than in the closed source world. From “Preserving Backward Compatibility, http://www.onlamp.com/lpt/a/5626, Garrett Rooney.

The Dark Side of Extreme Programming: The nightly test/build was so effective in empowering programmers to make changes, that API changes occurred too frequently without the necessary buy-in from the user community. From “Insight Insight”, http://www.itk.org/CourseWare/Training/InsideInsight.pdf, Bill Lorensen

Some argue that open source software should be used at your own risk. But even using open source software requires an investment in time, energy and funds. Also the reputation of the development community is at risk.

...consider your user base. If you have only a dozen highly technical users, jumping through hoops to maintain backward compatibility may be more trouble than it's worth. On the other hand, if you have hundreds or thousands of nontechnical users who cannot deal with manual upgrade steps, you need to spend a lot of time worrying about those kinds of issues. Otherwise, the first time you break compatibility you'll easily burn through all the goodwill you built up with your users by providing them with a useful program. It's remarkable how easily people forget the good things from a program as soon as they encounter the first real problem. From “Preserving Backward Compatibility, http://www.onlamp.com/lpt/a/5626, Garrett Rooney.

These investments are made by customers that include developers, users and sponsors.

Topic A

Topic B

Topic C

Position Statement 3 (Steve Pieper)

This is primarily a policy discussion, and so the central issue is how to effectively communicate with users of the toolkit about what they get when they use a particular piece of code. The general policies could be summarized as:

  • Different things should have different names.
  • Similar things should have similar names.
  • If two things have the same name, you can assume they will behave the same.

A way to interpret this is that if you come up with non-backwards compatible version of an algorithm, you should give it a new class name, like MyFilter2, rather than relying on the toolkit version number to indicate that it is different. Deprecation warnings at compile time can inform the developer that MyFilter is out of date. Dropping support for deprecated classes should happen when the toolkit itself gets a new name (like ITK4 instead of ITK3). Developers can choose when to migrate to a new class and/or a new version of the toolkit.

Topic A

Topic B

Topic C

Position Statement 4 (Stephen Aylward)

Backward compatibility is paramount in a toolkit, particularly one that is used by researchers. Being used by researchers means that the toolkit is being used by people who are accustom to creating on their own and who are not patient with outside impediments to their work. If a toolkit continually requires them to re-develop/re-test code that had previously worked, the toolkit will become viewed as an impediment to their work. The researcher will rightly place a high cost on the time spent re-developing, re-testing and the perceived risk that their research is subject to the whim of others. Eventually, that cost will outweigh the benefits, and the user-pool will dwindle.

Backward compatibility applies to the API and the operation of a toolkit. One person's bug is another person's feature. An incorrectly spelled function name is not a bug once that function has been called by a user. Subsequently changing the function's name to the correct spelling does create a bug in the user's code. The same bug/feature dichotomy exists when the API of a set of functions is perceived to be inconsistent. The same bug/feature dichotomy may even exist when a function has side-effects that are perceived as unwanted or even when a function has outputs that are perceived as incorrect. The guiding philosophy should be: Once a function is released and it performs a particular operation, even if the operation it performs is not what was originally intended, its operation cannot be considered a bug. If you want to perform a different operation, you should create a new function, and perhaps begin to deprecate the other function. As it is a general philosophy, there will be times when it does not apply, however, in a strict environment, the sole exception may be when the function name or the function's design specification unambiguously defines the intended operation and yet the function does not provide that operation.

Current developers and users must work to promote the next generation of developers and users. Telling a user that they should not upgrade is an expression of the judgment that the future benefit of the toolkit does not justify the cost of the upgrade. It is saying to the developers that their continued efforts are not likely to be useful to the user. It is saying to the user that they are not welcome as active members in the community. Freezing a toolkit at a particular version is a legitimate project management decision for a user. It should, however, not be the mantra of the toolkit's developers.

Admittedly, radical changes are occasionally needed in a toolkit to keep it current. When making those changes, it is important to apply an otherwise contrary adage: "if you are going to break a standard, then you should REALLY break the standard." That is, the changes to the toolkit should be planned and drastic. Planned means that the changes (1) should be announced and discussed well in-advance of their release; (2) should be well supported, with clear transition paths and documentation; and (3) should be driven by the needs of the community. Drastic means that the changes should be extensive. If the changes being introduced have a subtle appearance then it is likely that they could instead be done in a backward compatible way or as an extension to the existing framework. Breaking backward compatibility should only be allowed if the collective voice of the user community calls for a change that necessitates the complete overhaul of a framework or function to support the trends in research, hardware, or compilers.

Balancing the above issues is best handled by a systematic process for adopting new features, testing backward compatibility, and implementing alternative frameworks. The Insight Toolkit has a well established method for adopting new features, the Insight Journal. The Insight Journal has evolved to be and continues to improve as an effective method for receiving and reviewing candidate methods for inclusion in the Insight Toolkit. Testing backward compatibility is enabled by CTest and coverage counts, but disciplined application of those technologies is and will probably always be a challenge. An established mechanism for implementing, reviewing, and releasing alternative frameworks does not currently exist.

Establishing a mechanism (policy and software) for implementing, reviewing, and releasing alternative frameworks is critical to the continued success of the Insight Toolkit. If such a mechanism could be provided, then daily backward compatibility challenges would have a controlled outlet.




Topic A

Topic B

Topic C

Position Statement 5 (Simon Warfield)

Backward compatibility in the Insight Toolkit is an important issue that must balance the needs of the Insight community for stability, for innovation and for clarity.

In a dynamic open source software environment such as the Insight Toolkit, the source code is an active code base in which the existing code base is being applied to solve image analysis problems. The group of active developers is also dynamic, with new developers joining the community, making use of the code base, and sometimes adding to the code base, and some developers moving away from active utilization of the code base.

When developers in the community find an algorithm they want to use is not implemented in the Insight Toolkit, they can download the source code and create an implementation of that algorithm. New concepts can be added to the toolkit by extending the existing classes or by adding new classes. Testing and evaluation of new implementations can be achieved at the time of implementation and into the future by creating regression tests which are automatically executed, and which verify the operation of the code is as the developer first expected. Developers may choose to contribute code back to the Toolkit.

Code that is present in the Toolkit, and that which is contributed back to the Toolkit creates both an opportunity for and an obligation incumbent upon, the entire Insight community: the opportunity to utilize the new code, and also the burden of ensuring correct operation of the code within the framework of the Toolkit.

Backward compatibility of the evolving Toolkit becomes an issue when additions, modifications or bug fixes create changes in the code which alter the operation of existing code. Purely new additions which simply extend the toolkit rarely create backward compatibility clashes with existing code, but must be considered in light of the obligation to maintain the operation of new code into the future.

The decision of if, when and how to preserve or achieve backward compatibility largely rests in the understanding of the obligation of the developer community to itself, to preserve the operation and application programmer interface of the toolkit.

The nature of the obligation that the developer community creates and assumes upon itself should be clear and carefully documented, so that new users coming to the community are readily aware of the procedures and policy of the community. This discussion can help to set the expectations and understanding of the developer community.

The consequences of different choices of the obligation of backward compatibility, and the perceived importance of these consequences should dictate the nature of the obligation the developer community takes on.

One of the key reasons for the success of open source software has been the credo of `release early, release often'. With this approach, the early deployment of software before it is fully tested and validated, has been found to enable the rapid development of useful and important software. As a side effect of this development, "given enough eyeballs, all bugs are shallow", as the shared expertise and interest of the community enables the rapid discovery and correction of bugs in the code. A consequence of this approach is that software is deployed to the community as it is developed rather than after it is developed. The value of this to the user community is in the rapid response this enables to design and implementation issues, where the dynamic environment of the software bazaar enables those most directly impacted by a change to sort out and resolve the issue, rather than waiting for an answer to be delivered from those in the software `cathedral' [[ http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ ]].

In a young code base undergoing rapid expansion, code can be implemented and regression tested faster than the community can fully appreciate the consequences of the impact of parts of the code base on other parts, and before the interface the code presents can be judged. When this occurs, the interface to functionality erected by a particular implementation, needs to be considered on its merits, just as the implementation of a solid API would be, and not regarded as a sacrosanct interface - being first doesn't mean being best, and shouldn't mean immortality. A regression test will ensure that the implementation achieves what the developer wanted, but doesn't ensure that what the developer wanted was what the community comes to understand is the best strategy to preserve under the obligation of backward compatibility. Indeed, it may be valuable to trial several different interfaces to particular functionality before it becomes clear what will be easiest to use, and what will have the greatest clarity to the largest number of developers.

Backward compatibility enables users of the toolkit to utilize the code base now and into the future with the expectation that software written using the Toolkit will continue to work in the same way in the future. The user community also understands that the code has been released early and often, and can be expected to evolve. At certain milestone occasions, a version of the code is tagged and declared to be a special release - these special releases represent particular stable snapshots of the code base, which have accumulated new functionality, improvements in documentation and efficiency and bug fixes, of such significance that a new version number is provided and the code understood to be especially well suited for wide spread adoption and utilization.

Old versions of the code are not removed and continue to function identically to how they always have, as they are not changed. New versions of the code are modified, with a view to providing the developer community an improved code set.

In general, maintaining compatibility with previously released versions is desirable, because it allows code that utilizes prior releases to adopt new releases easily, with no burden on the developer community, while providing the benefits of new or improved functionality included in the new release. However, an excessive insistence on backward compatibility can hamper innovations, prevent bugs from being fixed, and destroy the aesthetic pleasure of a well-designed application programmer interface. It can fail to provide to the developer community a clear indication of how functionality in the toolkit is intended to be used, by encouraging the development of similar but incompatible implementations of similar functionality.

In particular, when a bug is discovered, a decision to maintain and preserve code that functions incorrectly for the sake of backward compatibility is wrong. Bugs should be fixed, and in cases where external code depends on wrong results from a function to operate correctly, those in the community who chose to adopt new versions or new releases of the code base, will need to be update their code when new code to establish correct operation is implemented. Consider a developer, user or sponsor who has an application that utilizes the Insight Toolkit, but which is impacted by a bug fix. They will have a choice to continue to use the version of the Toolkit that works well for them, or to invest in making changes in order to adopt a new version. They may choose to correct the code that incorrectly depended on wrong results from code in the Toolkit.

A wrong approach would be to decide to impose the burden of maintaining two sets of code, one that works incorrectly and one that works correctly. This distributes the cost of understanding, maintaining, testing and evaluating bad code and good code to the entire developer community, both at the time the decision is made and into the future while ever the bad code is preserved. When new developers come to the code base, they will need to decipher and understand the wrong code and the new code, choose which to use, and the burden of identifying the desirable code to use is a significant one. It can only be offset by a significant investment by the current community in documenting and explaining the bad code and advising all new users to avoid it. In deciding to commit to the support and maintenance of wrong code, the burden on inactive or older users who have decided not to actively support or maintain their own code is reduced, but at the cost of increasing the burden on all future users and developers and the existing active community.

Similarly, code that implements a poor API or which has made wrong design choices, needs to be carefully and thoughtfully eliminated as the toolkit matures, rather than having that code add to the cost and investment that the user community makes in maintaining the toolkit. A simple mechanism to do this is to ensure a level of backwards compatibility will be maintained for particular versions or for a particular time frame, to provide warning of obsolescence through a mechanism of deprecation, and then to remove poor design decisions and wrong APIs at major version upgrades. This provides a balance in effort and cost to the established community, to the currently active developers, and to the future developers of the toolkit.


Topic A

Topic B

Topic C

Background Links