James Dixon’s Blog

James Dixon’s thoughts on commercial open source and open source business intelligence

The Software Paradox by Stephen O’Grady

O’Reilly has released a free 60-page ebook by Stephen O’Grady called The Software Paradox – The Rise and Fall of the Software Industry. You can access it here.

So what is “The Software Paradox” according to O’Grady? The basic idea is that while software is becoming increasingly vital to businesses, software that used to generate billions in revenue is often now available as a free download. As the author says:

This is the Software Paradox: the most powerful disruptor we have ever seen and the creator of multibillion-dollar net new markets is being commercially devalued, daily. Just as the technology industry was firmly convinced in 1981 that the money was in hardware, not software, the industry today is largely built on the assumption that the real revenue is in software. The evidence, however, suggests that software is less valuable – in the commercial sense – than many are aware, and becoming less so by the day. And that trend is, in all likelihood, not reversible. The question facing an entire industry, then, is what next?

The ebook is well researched, well thought-out, and worth a read if you work in the software industry.

O’Grady describes the software industry from its early beginnings to today, and considers the impact of open source, subscription models, and the cloud on many aspects of the industry. His conclusions and his summary are very interesting. However, I think he has overlooked, or understated, two aspects.

The Drift to the Cloud

O’Grady identifies cloud computing as a disruptive element and provides details on the investments by IBM, Microsoft, and SAP in their cloud offerings, as well as the impact of the Amazon Cloud. He does a good job of describing who did what and when, but does not really get to the bottom of the “why”, other than pointing to consumer demand and disruption in the traditional sales cycle.

Here is my take on why the demand for hosted and cloud-based offerings is increasing and will continue to increase. Consider the evolution of a new business started in the last 5 years. As an example let’s use a fictitious grocery store called “Locally Grown” that cooperates with a farmer’s market to sell produce all week long instead of just one or two days a week.

  • Locally Grown opens one store in San Diego. It uses one laptop and desktop-based software for all its software needs, including accounting, marketing, etc.
  • Things go well and the owner opens a second Locally Grown across town. The manager of the second store emails spreadsheets of sales data so the accounting software can be kept up to date.
  • When the third store opens, the manual process of syncing the data gets to be too much. The company switches to Google Docs and an online accounting system. This is an important step, because what they didn’t do was buy a server and hire an IT professional to set up on-premise systems, configure the firewall, create the user accounts, etc.
  • As the company grows a payroll system is needed. Since accounting and document management are already hosted, it is easiest to adopt a hosted payroll system (that probably already integrates with the accounting system).
  • Soon an HR system is needed, and then a CRM system. As each system is added, it makes less and less sense to choose an on-premise solution.

You can see from this example that the important decision to choose hosted over on-premise is made early in the company’s growth. Additionally, that decision was not made by an IT professional; it was made by the business owner or a line-of-business manager. Hosted application providers aim to make it easy to set up and easy to migrate from desktop solutions for this reason. By comparison, the effort of setting up an on-premise solution seems complicated and expensive to a business owner.

I love me a good analogy, and Amazon’s Jeff Bezos has a good one for the software industry by comparing it to the early days of the electricity industry (TED talk here). I think the analogy goes further than he takes it. In the early days of the electricity grid, the businesses most likely to want to be connected were small ones without electricity, and those least needing the grid were large businesses that had their own power plants. This same effect can be seen with cloud adoption as small businesses with no server rooms or data centers are the most likely to use the cloud for all their needs, and large businesses will be the slowest to migrate.

In their quarterly report for Q1 of fiscal 2016, Salesforce announced $1.41 billion in subscription and support revenue. They don’t release information about their customers or subscribers, but it is generally known that they have over 100,000 customers. But consider that in the USA there are 28 million small businesses (fewer than 500 employees). These are the next generation of medium-sized, and then large, businesses. Even if we are generous and put Salesforce’s customer count at 200,000 in the USA (by ignoring the other 20 countries), it means that Salesforce has a market penetration of less than 1% and still generates roughly $6bn a year. So, if the rest of those businesses eventually spent at a similar rate, the hosted CRM market in the USA alone would be worth more than $600bn a year.
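The back-of-envelope sizing above can be written out as a short calculation. This is only a sketch: the constant and function names are mine, and it carries the same simplifying assumption as the paragraph, namely that every small business would eventually spend what an average Salesforce customer spends today.

```python
# Rough market sizing using only the figures quoted in the post.
QUARTERLY_REVENUE_USD = 1.41e9      # subscription and support revenue, one quarter
US_SMALL_BUSINESSES = 28_000_000    # small businesses in the USA
ASSUMED_US_CUSTOMERS = 200_000      # deliberately generous customer count


def annualize(quarterly_revenue: float) -> float:
    """Extrapolate one quarter of revenue to a full year."""
    return quarterly_revenue * 4


def penetration(customers: int, businesses: int) -> float:
    """Fraction of the addressable businesses that are already customers."""
    return customers / businesses


def implied_market(annual_revenue: float, share: float) -> float:
    """Market size if every business spent like the current customers (an assumption)."""
    return annual_revenue / share


annual_revenue = annualize(QUARTERLY_REVENUE_USD)
share = penetration(ASSUMED_US_CUSTOMERS, US_SMALL_BUSINESSES)
market = implied_market(annual_revenue, share)

print(f"annual revenue: ${annual_revenue / 1e9:.2f}bn")
print(f"penetration:    {share:.2%}")
print(f"implied market: ${market / 1e9:.0f}bn")
```

Rounding the revenue up to $6bn and the penetration up to 1% gives the $600bn figure. With the unrounded inputs the penetration is about 0.7%, so the implied market comes out somewhat larger.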

So, in my opinion, the current demand for hosted and cloud-based offerings is largely (or at least significantly) fueled by the latest generation of small businesses that will never have an on-premise solution. This trend will continue relentlessly, and perpetually (until something easier and cheaper emerges), and the market is vast.

New License Revenue

O’Grady does a lot of analysis of software license revenues over the past 30 years and compares it with subscription models and open source approaches. But he is missing a fact that might alter his opinions a little. Most people are unaware of this fact, because traditional software companies have a dirty little secret.

It is rational to think that when you buy a piece of software from IBM, or Oracle, or SAP, or Microsoft, the money you give them pays for your part of the development of the software, the cost of delivering it, the cost of running the business, and a little profit on top. But this is unfortunately not the case. When you buy a new software license, the fee you pay typically covers only the sales and marketing expenses. In extreme cases it doesn’t even cover those. In the past, Oracle’s sales and marketing departments were allowed to spend up to 115% of the new license revenue. Oracle was losing money just to acquire customers. In the latest Oracle quarterly report they state that new license revenue was $1,982 million and sales and marketing expenses were $1,839 million. So about 93% of your license fee goes towards the sales and marketing expenses needed to get you to buy the software, and the remaining 7% doesn’t even come close to covering the rest. As a consequence, a traditional software company is not at all satisfied that you have just purchased software from them, because they have made a loss on the deal.
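For anyone who wants to check the Oracle ratio, here is the arithmetic as a small sketch. The variable names are mine; the dollar figures and the 115% cap are the ones quoted above.

```python
# Figures quoted above, in millions of US dollars.
NEW_LICENSE_REVENUE_M = 1982
SALES_AND_MARKETING_M = 1839

# Share of each new-license dollar consumed by sales and marketing.
sm_share = SALES_AND_MARKETING_M / NEW_LICENSE_REVENUE_M

# What is left to pay for development, delivery, overhead, and profit.
left_over_m = NEW_LICENSE_REVENUE_M - SALES_AND_MARKETING_M

# The historical cap: up to $115 of spend for every $100 of new license revenue.
SPEND_CAP_PER_100 = 115
loss_per_100_at_cap = SPEND_CAP_PER_100 - 100

print(f"sales and marketing share: {sm_share:.1%}")
print(f"left over: ${left_over_m}m")
print(f"loss per $100 of license revenue at the cap: ${loss_per_100_at_cap}")
```

The share lands between 92% and 93%, so whichever way you round it, less than a tenth of the license fee is left for everything else.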

So if the license fee does not pay for the software, what does? It is the renewals, upgrades, up-sells, support, and services that generate the income that funds the development of the software and keeps the lights on.

In The Software Paradox O’Grady talks about the massive rise in software license revenue in the 1980s and 1990s, and its subsequent decline. But if you consider that software license revenue is really just fuel for sales and marketing, maybe the decline can be partially attributed to innovation in the sales and marketing worlds (evaluation downloads, online marketing, etc.) as the heavy and expensive enterprise sales model has changed over time. Maybe the 1980s and 1990s were a seller’s market, and the expensive software of the time was overpriced. O’Grady describes many contributing factors for the Software Paradox; maybe this is yet one more.


Written by James

June 5, 2015 at 5:21 pm

Posted in Uncategorized

Pentaho Labs Apache Spark Integration

Today at Pentaho we announced our first official support for Spark. The integration we are launching enables Spark jobs to be orchestrated using Pentaho Data Integration so that Spark can be coordinated with the rest of your data architecture. The original prototypes for a number of different Spark integrations were done in Pentaho Labs, where we view Spark as an evolving technology. The fact that Spark is an evolving technology is important.

Let’s look at a different evolving technology that might be more familiar – Hadoop. Yahoo created Hadoop for a single purpose – indexing the Internet. In order to accomplish this goal, it needed a distributed file system (HDFS) and a parallel execution engine to create the index (MapReduce). The concept of MapReduce was taken from Google’s white paper titled MapReduce: Simplified Data Processing on Large Clusters. In that paper, Dean and Ghemawat describe examples of problems that can be easily expressed as MapReduce computations. If you look at the examples, they are all very similar in nature. Examples include finding lines of text that contain a word, counting the number of URLs in documents, listing the names of documents that contain URLs to other documents, and sorting words found in documents.
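To make the shape of those problems concrete, here is a toy, single-process sketch of the map, group, and reduce steps in plain Python. It is not Hadoop code and not taken from the paper; the function names and the sample documents are made up for illustration. It expresses two problems of the kind listed above (counting URLs and finding text that contains a word) as a mapper plus a reducer.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group emitted values by key, reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)  # stands in for the shuffle step
    return {key: reducer(values) for key, values in grouped.items()}


# Hypothetical input: (document name, document text) pairs.
DOCUMENTS = [
    ("a.txt", "see http://x.example and http://y.example"),
    ("b.txt", "no links here"),
    ("c.txt", "visit http://x.example today"),
]


def url_mapper(record):
    """Emit (url, 1) for every URL-looking token in a document."""
    _name, text = record
    for token in text.split():
        if token.startswith("http://"):
            yield token, 1


def grep_mapper(word):
    """Build a mapper that emits (document name, text) when the text contains word."""
    def mapper(record):
        name, text = record
        if word in text.split():
            yield name, text
    return mapper


url_counts = map_reduce(DOCUMENTS, url_mapper, sum)
matches = map_reduce(DOCUMENTS, grep_mapper("visit"), list)

print(url_counts)
print(matches)
```

Both jobs boil down to "emit a key for each thing you see, then combine whatever shares a key", which is why the examples in the paper all look so alike.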

When Yahoo released Hadoop into open source, it became very popular because the idea of an open source platform that used a scale-out model with a built-in processing engine was very appealing. People implemented all kinds of innovative solutions using MapReduce, including tasks such as facial recognition in streaming video. They did this using MapReduce despite the fact that it was not designed for this task, and that forcing tasks into the MapReduce format was clunky and inefficient. The fact that people attempted these implementations is a testament to the attractiveness of the Hadoop platform as a whole in spite of the limited flexibility of MapReduce.

What has happened to Hadoop since those days? A number of important things, like security, have been added, showing that Hadoop has evolved to meet the needs of large enterprises. A SQL engine, Hive, was added so that Hadoop could act as a database platform. A NoSQL engine, HBase, was added. In Hadoop 2, Yarn was added. Yarn allows other processing engines to execute on the same footing as MapReduce. Yarn is a reaction to the sentiment that “Hadoop overall is great, but MapReduce was not designed for my use case.” Each of these new additions (security, Hive, HBase, Yarn, etc.) is at a different level of maturity and has gone through its own evolution.

As we can see, Hadoop has come a long way since it was created for a specific purpose. Spark is evolving in a similar way. Spark was created as a scalable in-memory solution for a data scientist. Note, a single data scientist. One. Since then Spark has acquired the ability to answer SQL queries, added some support for multi-user/concurrency, and the ability to run computations against streaming data using micro-batches. So Spark is evolving in a similar way to Hadoop’s history over the last 10 years. The worlds of Hadoop and Spark also overlap in other ways. Spark itself has no storage layer. It makes sense to be able to run Spark inside of Yarn so that HDFS can be used to store the data, and Spark can be used as the processing engine on the data nodes using Yarn. This is an option that has been available since late 2012.

In Pentaho Labs, we continually evaluate both new technologies and evolving technologies to assess their suitability for enterprise-grade data transformation and analytics. We have created prototypes demonstrating our analysis capabilities using Spark SQL as a data source and running Pentaho Data Integration transformations inside the Spark engine in streaming and non-streaming modes. Our announcement today is the productization of one Spark use case, and we will continue to evaluate and release more support as Spark continues to grow and evolve.

Pentaho Data Integration for Apache Spark will be GA in June 2015. You can learn more about Spark innovation in Pentaho Labs here: www.pentaho.com/labs.

I will also lead a webinar on Apache Spark on Tuesday, June 2, 2015 at 10am PT (1pm ET). You can register at http://events.pentaho.com/pentaho-labs-apache-spark-registration.html.

Written by James

May 12, 2015 at 1:41 pm

Open Source: In praise of the profiteering enterprise, the greedy freeloader, and the selfish developer

Recently, Matt Asay talked about a number of different issues causing conflict in the free and open source world in a piece titled “The new struggles facing open source” and came to the conclusion that currently the biggest problem is the role of enterprises controlling projects (he says “controlling the community”, which is impossible, if you’ve ever tried it). He makes a lot of sense, but I don’t entirely agree with his points about the role of businesses in open source being detrimental. Hadoop, Spark, Storm, Kafka, Hive, HBase, etc. all came from enterprises that still employ the majority of the core contributors in most cases. Why did these companies create these technologies? Not for philanthropy. Not for the greater good. For better profit via better infrastructure. Having created those technologies they decided to open source them. For the greater good? No. For lower maintenance, and better profits, with side benefits of better mindshare and easier recruiting. Did these companies open source their domain-specific intellectual property that is the basis of their business? No, and they never will. They only open sourced internally developed infrastructure that is tangential to their business. Do these companies believe that all ideas inherently belong to the people of the world? No. They put into open source what was in their best interest to do so. Self-interest all round. Score 1 point for greed.

In another piece titled “Enterprises still miss the real point of open source” Matt argues that enterprises, while they are using a lot of open source, still don’t get it. He finishes with:

Again, merely using open source isn’t enough. Contributions are required.

But let’s look at “The rise and rise of open source” by Simon Phipps. This is a review of Black Duck’s most recent “The Future Of Open Source” survey. The net result is that, across all the important metrics, usage of open source for running businesses and creating products is now over 50% for the first time. Some of the metrics are still rising rapidly. 78% of respondents report that they are running their business with open source software, indicating that an approach based on, and using, open source is now the mainstream, and that purely proprietary approaches are now the minority. As a result, InfoWorld is stopping their open source special interest channel, because it is now the mainstream. Yay for open source.

But who are these companies that make up these statistics, that represent the majority of businesses? Are they all contributing to open source? The survey indicates that while 78% of businesses are running on open source, only 64% of those say they are contributing to open source. What do we call the greedy who use open source but do not contribute to it? They are the Freeloaders. Matt Asay says they need to contribute. I say they already have. If the freeloaders weren’t using open source, only 49.92% (78% * 64%) of companies would be running their business on open source. In other words, the only reason we can claim today that open source is the mainstream is the actions of the (apparently) non-contributing freeloaders. But isn’t tipping the balance of the overall market from proprietary to open source a contribution in itself? Of course it is. The act of merely using open source software displaces a proprietary alternative, and is a contribution in its own right. No matter how little you contribute, you still make a contribution; even the greedy who contribute nothing do. Score another point for greed.
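The freeloader arithmetic is easy to check. A quick sketch, using the two survey percentages quoted above (the variable names are mine):

```python
# Survey figures quoted above, as fractions of all responding companies.
RUNNING_ON_OPEN_SOURCE = 0.78      # companies running their business on open source
CONTRIBUTING_AMONG_USERS = 0.64    # of those, the share that say they contribute

# Companies that both use and contribute, as a share of all companies.
contributors = RUNNING_ON_OPEN_SOURCE * CONTRIBUTING_AMONG_USERS

# Companies that use open source without contributing: the freeloaders.
freeloaders = RUNNING_ON_OPEN_SOURCE - contributors

# Would open source still be the majority choice without the freeloaders?
majority_without_freeloaders = contributors > 0.5

print(f"contributing users: {contributors:.2%}")
print(f"freeloaders:        {freeloaders:.2%}")
print(f"majority without freeloaders: {majority_without_freeloaders}")
```

Take away the non-contributing users and the remaining share sits just under one half, which is the whole point: the freeloaders are what push open source over the line.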

So now let’s look at the people who do contribute. The vast majority of these are paid contributors employed by enterprises, and IT/software developers trying to get their job done and to roll out a product or feature. These activities include creating features, fixing bugs, translating, testing, etc. (the list is long). Enterprises fund these activities for several reasons including:

  • Getting a product to market
  • Lowering development costs
  • Lowering license fees
  • Improving time to market
  • Employee retention
  • Increasing mindshare and thought leadership

Philanthropy? Nope.

Developers fix bugs and contribute the fixes to a project because they don’t want to re-apply the fix in every future release. This is self-serving behavior. Do I care who controls or directs the project? Nope. The core contributors of the project accept the bug fix to increase quality, which helps adoption, which grows the project. This is self-serving behavior. One of the greatest and most powerful things about open source is that everyone can act out of self-interest, and everyone gains from everyone else acting selfishly. This makes the model very strong. Score another point for greed.

Final score: Philanthropy 0, Greed 3

If open source is ultimately driven by greed and self-interest, how is it any better than proprietary software development? Because it is an inherently better way to develop software, and in so many ways that the fight isn’t even close. Is it philosophically better? Yes, I believe the fundamental principles of open source are better than proprietary development. But is it morally better? No. The underlying power of open source over proprietary development is that greed is naturally converted into useful contribution, whereas with proprietary development greed translates into channel conflict, price fixing, monopolies, class action suits, vendor lock-in, and inefficient, low-quality, bloated software.

Open source rules the day. But philanthropy and the belief that all ideas belong to the world did not get it there.

Written by James

May 8, 2015 at 8:14 pm

Response to “Will There Ever Be Another Red Hat”

Response to Dan Woods’ Forbes article “Will There Ever Be Another Red Hat”

This is a very nice piece. Although I wouldn’t say “never”. When IBM was riding high it was difficult to foresee the rise of Microsoft, and when Microsoft was riding high Apple was nowhere.

In the piece, Dan refers to two of my blog posts about open source business models. Dan infers that I was saying that open source companies do not need sales and marketing budgets. That is not exactly what I was trying to say. They do need them. The difference is that you can run a primarily or exclusively inbound sales model, which is much cheaper than an outbound model. You definitely need a marketing budget to create the inbound leads. Having an active community helps generate leads and lowers sales and marketing costs even more.

My main point was to refute the idea that the subscription model is inferior because the lack of an initial license fee hurts – in the proprietary model that license fee pays for the cost of acquiring customers and nothing more. As an example of the proprietary model Qlik is losing $25m-$30m per quarter on $125m revenue because their S&M budget is $75m. They are losing money to gain customers, with the hope that they can make enough services/support/up-sell dollars to make a profit eventually.

There are fundamental differences between the proprietary license fee model and the open source subscription model, and unless you understand both of them well, you are not in a good position to compare them or criticize either of them (as an analyst was doing at the time).

Open source subscription models can be very successful, but their economics seem to be poorly understood by some. Amazon made $5bn last year renting out servers in the cloud. Google makes billions by selling ads for 6c each. In-app purchases generate billions, 99c at a time. These are new and kinda weird ways to make money. Mobile, Cloud, Big Data, and IoT are changing things quickly and everyone (including analysts) will have to pay attention if they want to keep up.

Written by James

April 28, 2015 at 7:19 pm

Open-Plan Offices: Silicon Valley is right, your boss is wrong.

My response to this article: http://www.theage.com.au/comment/silicon-valley-got-it-wrong-the-openplan-office-trend-is-destroying-the-workplace-20150420-1molwh.html

To summarize this person’s critique of open-plan offices:

  • My boss took away cubicles (with no open line of sight) and lined us up against a wall (with no open line of sight).
  • My boss took away cubicles (with little interactivity) and lined us up against a wall (with little interactivity).
  • My boss took away cubicles (with understood rules of interaction) and lined us up against a wall (with no understood rules of interaction, and no guidance).
  • My boss took away cubicles (which encourage personal productivity) and tried to create an open-plan environment (which encourages team productivity), but failed badly.
  • All of this is the fault of open-plan offices (and not my boss).

In my job I could work from home every day, but I don’t, because team productivity is more important than any one person’s productivity. If you interview people about their personal productivity, they rarely think about the big picture, only their personal stuff. I could also have an office if I wanted, but I don’t. Again, open-plan is about team productivity (see everything ever written about Agile). I do, however, work from home occasionally to give the others a break from my glorious wit.

A productive team creates self-governing rules. In our bullpen, if you leave your phone at your desk and it rings, when you return to your desk your phone will be in a sound-proof box. Sometimes the box will be hidden. Headphones are fine, but audible music is a no-no (there are more than sufficient Nerf guns to stop that obtrusive behavior quickly). If a person in your environment is being inconsiderate, it’s not the fault of the environment, it’s the fault of the person. Blaming the environment will not solve the problem.

In a team environment, frictional conversations (the ones that just happen when people are close to each other) are very valuable. In the early 2000s we tried a purely remote environment, and the results were not great, so when we started Pentaho, we went with a “co-locate and open-plan” approach when possible. So much so that, for the last ten years, none of Pentaho’s founders (including the CEO, CTO, and Chief Engineer) has had a closed office.

Open-plan environments are not right for all teams. Any group that regularly needs phone conversations, such as Sales and Support, is not a good candidate for an open-plan environment. But again, it is not the fault of the environment. If an environment is being implemented wrongly or inappropriately, it is not the fault of the environment, it’s the fault of the implementor.

I’m sorry that the author of this article had a negative experience with an open-plan office. But it’s not true that Silicon Valley got it wrong. Your boss got it wrong.

Written by James

April 23, 2015 at 1:18 am

Are individuals born with characteristics that dispose them to entrepreneurship?

This is my answer to this question on Quora: http://www.quora.com/Are-individuals-born-with-characteristics-that-dispose-them-to-enterpreneurship

I think this is true to a certain extent. I also think that, in many cases, those characteristics align with the traits that come with ADD/ADHD.

Google “entrepreneurs ADHD” and take a look at the results or read this Forbes article for more details: ADHD: The Entrepreneur’s Superpower

Some famous entrepreneurs who have acknowledged their ADHD (source: Famous People with ADHD – Adult ADD Center of Maryland)

* Richard Branson (founder of Virgin Records, Virgin Atlantic, Virgin Galactic etc)
* David Neeleman (founder of JetBlue Airways)
* Alan Meckler (founder of several magazines and companies)
* Paul Orfalea (founder of Kinko’s)
* Charles Schwab (founder of Charles Schwab)
* Walt Disney (founder of Walt Disney)

Historical Figures who showed characteristics of ADHD (source: Famous People With ADHD Traits)

* Abraham Lincoln
* Robert F. Kennedy
* John F. Kennedy, who (allegedly) smoked pot to help him focus.
* Benjamin Franklin
* Henry Ford
* Thomas Edison
* Leonardo da Vinci
* Alexander Graham Bell
* Orville and Wilbur Wright
* Sir Isaac Newton
* Albert Einstein

As my friend Marten Mickos says, follow your passion and believe in willpower.

But if you have (manageable) ADHD, it might help.

Written by James

April 16, 2015 at 7:19 pm

My thoughts on “Why Women Shouldn’t Code”

Here is the original article: https://medium.com/@hardaway/why-women-shouldnt-code-82205165e64a

Firstly, the title is attention-grabbing nonsense. The article is about why women (girls) should not be forced to learn to code at school.

There are some good points in the article. In general, today, women are less interested in software careers and software degrees than men are. That’s a fact.

Companies like to promote from within, but many (male) software engineers make lousy managers and directors. Myself included. So finding good software engineering managers is hard. That’s a fact. Imagine Sheldon Cooper managing a team of 10 Sheldon Coopers. What a nightmare. The best managers I ever had as a software engineer were all women. The others were all men. So I would like to see college and vocational courses just for Engineering Management. That might be a role that attracts more women than men. If so, great.

I don’t agree that women shouldn’t code. I don’t agree that girls should not be required to learn coding at school – and here is why. At school we learn Art, and Music, and History, and Sports. How many of us become professional artists, musicians, historians, or athletes? Almost none of us, but that’s not why we learn them. Learning about those subjects enriches us and gives us context. So much of our world today is driven by and dependent on computers that a basic understanding of how they are controlled seems, to me, to be more important than knowing who won the Battle of Antietam and when.

If teaching girls to code results in more female software engineers, that’s great; if it doesn’t, that’s OK too, because they will know more about the world than they did before.

Written by James

April 15, 2015 at 11:13 pm
