James Dixon’s Blog

James Dixon’s thoughts on commercial open source and open source business intelligence

3. Software Development

As mentioned earlier, there are significant differences in the daily tasks and responsibilities of developers under a COSS model.

Forums

Developers need to monitor the community forums and help out where they are needed. This does not always mean that developers should jump in and answer every forum post as quickly as possible. When the open source project is new and the community is still building, this is necessary, as there won’t be any other community members capable of answering the posts. There are also posts that are so involved that an external community member is unlikely to be able to answer them. But as the community grows, the COSS company’s developers should resist the temptation to jump in immediately and should allow newer community members the opportunity to help the poster. This is not always easy and has to be a conscious effort at times.

Wherever possible, give the community the opportunity to participate.

The COSS company is, after all, giving people the opportunity to work for ‘free’. This is an easy concept in principle but can be hard to implement without complete transparency.
The forums need direct developer involvement because the community includes developers and discussions about lines of code frequently occur.

Visibility

Developers are much more visible to the outside world in an open source project and COSS company than in a proprietary model. They are not hidden away in a campus somewhere and the most presentable of them dusted off once a year for the annual user conference. This visibility is refreshing and motivating.

Consumer Focus

Community

Developers spend considerably more time communicating with other members of the community than they do communicating with customers in a proprietary model. This interaction between engineers and consumers happens daily or hourly instead of once a quarter at a focus group or once a year at a user convention. Over time this changes the way that engineers approach design and implementation as they start to anticipate problems that the consumers may experience. Developers also start to anticipate how easy it will be to explain how to configure or use a feature and adjust their design to make it easier.

This is an important point. Designing for ease of use is self-serving from the developers’ perspective because it reduces the community support that they need to provide. This incentive is not nearly as strong under the proprietary model, and it helps produce better software for the community. A by-product is that the customers, partners, and support and services teams all benefit as well.

Open source and COSS engineers are self-motivated by the community to produce better software.

The majority of participants in the community are technical people. They have an inherent distrust of glossy marketing material. However, some of these people still represent potential customers. They are easy to spot on the forums when their posts start with statements like ‘I need help. My boss has given me two days to evaluate this software’. They often use bold and uppercase text to highlight how urgently they would like the community to assist them. Other potential customers are harder to spot. In both cases developers need to know how to assess these situations and react to them.

Customers

Developers are also involved in the creation of the ‘whole product’ in the Go To Market program. For example:

  • Providing materials for training course content and end-user documentation.
  • Providing knowledge transfer to support and services organizations.
  • Helping create demos and samples.

They are also involved in providing third-level (and sometimes second-level) support to customers.

These tasks match very closely to the proprietary model but most of them are never undertaken under the open source model. This is because, for most software engineers, writing software is fun, whereas the tasks needed for a full Go To Market program are much less interesting.

Job Description

You can consider a developer employed by a COSS company in one of two different ways:

  • Traditional developer with forum duties: From this perspective a developer employed by the COSS company writes software, participates in Go To Market and helps out on the forums. The employer happens to have processes/infrastructure in place and an open source business model that puts some or all of the software into open source. In this viewpoint the developers are considered to be in a slightly modified role.
  • Committer with benefits: From this perspective developers are viewed as full-time open source committers that the COSS company happens to employ. In this viewpoint the developers are considered to be in a significantly modified role.

Adoption of the second viewpoint has many implications. In both cases the developer is employed by the COSS company and the COSS company sets the long-term roadmap for the software and the short-term priorities of the engineer. The differences come to light when you look at the infrastructure, processes, and tools that are put in place to support the developers.

If you look at it from the first perspective you have the luxury of putting in place infrastructure that is internal to the organization, processes that include hard-coded paths between roles, and knowledge that is not publicly accessible. In short you have the option to choose how transparent and open you really are.

If you look at it from the second perspective you come up with infrastructure and processes that are more transparent. For example:

  • Source code control system: From the first perspective it is acceptable to have a system that is behind a firewall and that periodically publishes the source code to the outside world. From the second perspective it is better to have a public server that is live for all participants.
  • Feature and defect management: From the first perspective it is beneficial for product manager and development managers to keep engineers focused by only making current work items visible and keeping the rest (lower priority, stretch goals etc.) off the table. From the second perspective it is better to have all the work items visible and clearly categorized and prioritized so that any committer with additional time and energy can help advance the software.
  • Community liaison: From the first perspective the developers are not considered part of the community and so the community liaison role will be solely focused on the external community (and possibly be detrimental to the relationship between the developers and the community). From the second perspective the community liaison views the employed committers as part of the community and so will include them as part of their community activities.
  • Participating in other projects: Sometimes, in order to complete a feature or fix a defect, a developer needs to contribute a change or fix to another open source project. From the first perspective this is an item of work that is necessary but not easy to categorize. From the second perspective this item of work is 100% in line with the job description.

This difference becomes obvious when partners or customers want to contribute or sponsor committers to the project. How do they know what would help improve the software the most? How does the COSS company handle those contributions? Whether or not the COSS company needs to put in place new processes or procedures or infrastructure depends on which philosophy has been followed.

Dual Focus

Developers are one of the few groups within the COSS company that are dual focused. Developers cannot be focused solely on the community because the creation of whole product is complex, resource intensive, and very ineffective without development involvement. This can lead to conflict if the market-focused groups within the COSS company do not appreciate the extent to which the role of development is different under the COSS model.

Development Traction

Raymond makes the statement ‘Good programmers know what to write. Great ones know what to rewrite (and reuse)’. This is interesting and applicable on several levels. The architecture of the Pentaho platform is not a rewrite of the BI products we did in our other start-ups. Its primary design goals are based on the lessons we learned over 15 years of implementing our own, and other people’s, BI products for customers. We started out with many years of good and bad experiences and used the professional open source model to add the ability to build upon a base of open source components. These are both significant factors.

This speaks to a point Fred Brooks makes in ‘The Mythical Man-Month’: ‘Plan to throw one away; you will, anyhow’. The founders of a COSS company have typically ‘thrown one away’ (the previous proprietary software they created) before they even start, and sometimes more than one.

Another factor is that, being an open source company, we cannot embed licensed components into our BI platform. Since we make it available for free, we cannot incur any third-party costs for each download. When we need a new component, whether it is an embedded database, rules engine, scheduler, or workflow engine, we must either write it ourselves or use an existing open source component. There are over 100,000 open source projects on SourceForge.net alone, and then there are the Apache Foundation, Freshmeat, ObjectWeb, etc. We have a lot of choice when it comes to components that we can use. My experience is that when we need an infrastructure or middleware component (that is not specific to our domain) there are, without fail, viable open source offerings. When we need a component that is BI-specific we usually need to write it ourselves.

We end up creating (and owning the copyright to) only the functionality that is unique to our domain, and for everything else we make use of existing open source components. This is a very efficient development model. Eric Raymond talks about this principle at length in his essay ‘Homesteading the Noosphere’ in ‘The Cathedral and the Bazaar’. The outcome is that we own the copyright to the core functionality of our platform and we do not need to own the copyright to an infrastructure utility such as a scheduler. This gives us enormous traction when it comes to development of functionality.

Proprietary software vendors are usually under immense pressure to get a product to market quickly. In order to reduce time-to-market they have two choices for functionality that is not core to the product: use open source components, which are financially free but can be politically costly, or use commercial components that come with a financial cost. Either way they don’t own the intellectual property. The embedding of open source components by proprietary vendors leads to an interesting positioning problem for them (see the Marketing section below).

Integration Effort

Another factor in traction is the ‘mergers and acquisitions’ factor. When a proprietary company needs to merge or integrate products, it is the result either of a merger or acquisition or of licensing another proprietary product to fill a significant functionality gap. In most cases the two products involved are ‘whole products’, which makes the integration significantly harder. Integrating the features is not hard if there is only minor functionality overlap between the two products, and this is typically the case. However, there is often significant overlap when it comes to authentication, authorization, metadata, repositories, installation, administration, etc. These features are part of the ‘whole product’ aspect of the software.

This makes sense if you look at the things you have to add to a typical piece of software to make it into a product. If you start with two pieces of software, A and B, that have both been turned into ‘whole product’ by two different vendors, you now have two products, Y and Z, that have different features but that have both been packaged with a similar (but not always compatible) set of technologies designed to make the products easier to deploy and maintain. It is the integration and the migration of these two products into a single one that is much harder than the integration of the core software. That is to say it is typically much easier to integrate software A and B together than to integrate products Y and Z together. What can be even harder is the migration of the install-base from the individual products to the combined one.

In the COSS model things are different because we are embedding and integrating with open source components. We are typically dealing with components that were designed to be embedded. They were designed with interfaces that support the applicable standards. In open source, standards are not supported for marketing or check-in-the-box reasons. They are supported because they define boundaries in the technical landscape that enable open source developers to focus on the core functionality of their project. There is no point wasting time trying to create a schema for storing metadata models when the Common Warehouse Metamodel exists. Not only would recreating the work already accomplished be a waste of time, it would also be detrimental: if the new schema became popular, it would dilute and fragment the metadata community.

Recruitment

One advantage for the COSS companies is that they can hire from their own community. There are several advantages to this:

  • The COSS company can monitor the performance and quality of community members over time before making any kind of offer. The COSS company can determine that a developer is capable of understanding the software and can contribute to it within the guidelines required.
  • The COSS company has a significant pool of interested developers at all times.
  • Once hired the new employee needs little in the way of training as they have been working with the software for some time.

This recruitment advantage applies to many departments within the COSS company, not just development. It also extends to the customers of the COSS company: for example, because of MySQL’s open source model there are probably many times more people familiar with the MySQL database than are familiar with Oracle’s database.

COSS engineers need to be comfortable with having their work peer reviewed (sometimes quite critically) by anyone who wants to. They also need to be comfortable with the commercialization of open source via a professional open source model. Not all engineers fit both of these profiles.

Community Web Site

The COSS company needs to provide a web site for the community to use. This web site is sometimes called the ‘Dev Zone’, the ‘Community’ site, the ‘.org’ site, or the ‘forge’.

This web site is where the principles of open source are most clearly visible.

  • Availability of design, source code, binaries and documentation (openness and ‘early and often’).
  • Forums for interaction between community members (community).
  • Communication from the administrators (transparency).
  • Public roadmap that is based on community feedback (openness).
  • Feature and defect tracking (transparency).
  • Ways of participating in the project (openness).

In order to provide these capabilities these web sites use forum software, a download site, a wiki for documentation, and a feature and defect tracking tool. These features make up the ‘project’.

Written by James

May 29, 2009 at 9:35 pm
