emblemparade.com

Why Your Next Project Should Be in Java

Originally published on LiveJournal, 5.6.07

Why should you start your next project in Java? Though competitive in many fields, it doesn’t seem like a real killer solution. If you’re writing a desktop application, there are many RAD tools with mature cross-platform libraries. You can use wxWidgets, U++, GTK and other libraries with C++ and Python, and there are comprehensive solutions, akin to Java, like Tcl/Tk and .NET. If you’re writing a web applet, though Java was the first to offer the capability, Flash has gone far beyond Java in performance and acceptance. On the server, too, there are other options. PHP was being used to make dynamic web pages long before Java was, and plenty of solid web sites are based on it. And if you’re making data-driven web applications, Python and Ruby might seem like better choices.

Where Java blows away all other platforms, makes them all look like ridiculous little toys, and is really the only serious choice, is in the enterprise.

So, what’s an enterprise?

A good example is a commercial bank. Now, banks were among the first customers of computers, back in the old days of IBM Mainframes. Since then, they’ve grown by leaps and bounds, heavily investing in information technology (IT). In addition to Mainframes, which still exist and have since evolved, you have complex distributed systems. There are VAX and Unix-like server farms, workstations, terminal servers, and networks, all of which evolved to solve different parts of the bank’s information empire: bank accounts, tax handling, transactions with other banks, customer databases, marketing, employee records, building management, email and communication, etc. There’s no single piece of hardware or software in existence that can handle all of this. And many of these pieces have been custom developed for the bank’s needs, using many different technologies, languages and platforms, and have been fine-tuned over time. They work well, employees know how to use them, and messing with them could easily lead to disaster. To overhaul and unify the system could cost billions, and to what end? In the future, the bank may further diversify, and pretty soon would need, again, an overhaul. This complex, heterogeneous mess of technologies is “the enterprise.” It’s a fact of information technology, and it’s not going away.

Most of the money made from computers today, from both hardware and software, is in the enterprise, not from the home PC consumer. It’s always odd when home users complain about, say, the Windows operating system, without realizing that the “home edition,” or whatever they’re using, is an engineering afterthought, at best a marketing ploy to keep the enterprises as Microsoft’s customers. Windows is an operating system tailored to work in such environments, and work well with Microsoft’s enterprise solutions. Apple’s success with home computers has a lot to do with changing this focus. But what, exactly, are these “enterprise solutions”?

It obviously takes a massive crew of people to maintain the enterprise 24/7, and in many specialized fields:

First, there are computer administrators and operators: people who make sure the computers running the software stay on. This is the oldest job in the enterprise, and has changed remarkably little since the old days. This industry is, in fact, the main reason for such conservatism in operating systems. The Unix administration tools we use today have remained the same more or less for two decades. They work and employed administrators know how to use them. Why invest in change?

A newer field is network management. The enterprise requires very complex networks for connecting the different systems, and for allowing end users (through terminals or PCs) access to all these services. In addition to cables, switches, satellite links, and other infrastructure, the main pieces of the network are routers and other software-based boxes. Why do we need them? The original information systems weren’t designed to work together, because it was in the interest of every IT provider, such as Digital, Sun or Microsoft, to lock enterprises into buying only their products. Indeed, Microsoft’s current monopoly-like business practices are inherited directly from those older IT companies (Digital, in fact, was far more brutal about their VAX platform than Microsoft is about Windows). Of course, a whole industry came into being to make these proprietary networks, if not talk to each other, then at least coexist peacefully on the same infrastructure, saving a lot in infrastructure and maintenance money. Companies like Cisco grew entirely from making such network boxes. I’ve worked in network management for a few years, and can tell you that it’s very complex, and requires learning a lot of protocols and router configurations. Obviously, there are many security issues (firewalls, encryption and more complex technologies) involved in making sure that the data can’t be stolen, and so network security developed as a sub-industry.

And, finally, there are the people who make sure that the different systems talk to each other. This industry is middleware, and Java-based solutions have quickly moved in to make almost all other solutions irrelevant. Let’s use our bank for an example scenario: say, a customer walks into a bank and wants to draw money from one of his credit cards into a checking account, because his account is almost empty and he needs some money in it in order for a check he wrote not to bounce. A great many different information systems come into play. First, the clerk must be logged in to a local server with specific permissions. This local server may then access another system which deals with employee security permissions across the enterprise. The customer’s list of services may then be accessed. And, remember, in addition to checking accounts, there may be credit cards, savings, loans, and a host of other services added over the years, which may exist on entirely different systems (located in different parts of the country, too). Now, in addition to all the many things that can go wrong (a tornado tears down a cable connection), there may be normal limitations that would make this whole transaction fail. For example, the customer may have reached the credit limit for the cycle, but may have an option to pay a higher interest in order to draw more. And this, of course, is just one simple example of the many millions of different kinds of transactions that a bank has to deal with every day. Programmers, of course, spend a lot of time making these tasks work smoothly. Middleware, however, can make it much easier, by standardizing the way the different information systems interact.

Middleware can work in three ways:

First, middleware can run on each of the information systems, “wrapping” the proprietary functionality with a standardized interface. In the case of very old systems, this may mean adding another system, a controller, whose whole job is to connect this older system to the enterprise. In case the proprietary system changes, the controller would have to be updated, too, but all the tasks relying on the system would continue to work the same. This server-based approach is one of the oldest.
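As a rough sketch of this wrapping approach, assuming a made-up legacy system: the `AccountService` interface, the `LegacyLedger` class and its fixed-width record format are all hypothetical, invented here purely for illustration. The only point is that callers see the standardized interface, while the adapter (the “controller”) absorbs the proprietary details.

```java
import java.util.HashMap;
import java.util.Map;

class WrapperDemo {
    public static void main(String[] args) {
        // Callers depend only on the standardized interface.
        AccountService service = new LegacyLedgerAdapter(new LegacyLedger());
        System.out.println(service.balanceInCents("0001")); // prints 12550
    }
}

// Hypothetical standardized interface that every wrapped system would expose.
interface AccountService {
    long balanceInCents(String accountId);
}

// Stand-in for an old proprietary system that answers in fixed-width text records.
// The record format is an assumption made up for this example.
class LegacyLedger {
    private final Map<String, String> records = new HashMap<>();

    LegacyLedger() {
        records.put("0001", "ACCT0001BAL0000012550");
    }

    String fetchRecord(String key) {
        return records.get(key);
    }
}

// The "controller": translates between the legacy format and the shared interface.
// If the legacy record format changes, only this class has to be updated.
class LegacyLedgerAdapter implements AccountService {
    private final LegacyLedger ledger;

    LegacyLedgerAdapter(LegacyLedger ledger) {
        this.ledger = ledger;
    }

    @Override
    public long balanceInCents(String accountId) {
        String record = ledger.fetchRecord(accountId);
        if (record == null) {
            throw new IllegalArgumentException("Unknown account: " + accountId);
        }
        // The balance is assumed to follow the "BAL" marker as a zero-padded number.
        int marker = record.indexOf("BAL");
        return Long.parseLong(record.substring(marker + 3));
    }
}
```

Any task written against `AccountService` would keep working the same if `LegacyLedger` were replaced, as long as a new adapter is supplied.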

A second, client-server approach is to add yet another information system, called an application server, which knows, through various plugins, how to connect with all the other systems. Then, instead of talking directly to another system, requests can be made to this main switchbox. In case one of the information systems changes, you would just remake the plugin. The huge advantage of centralizing the controller action is in allowing for transactions. Transactions are key to enterprise workflow. If each information service offers atomic operations, such as drawing a cash advance from a credit card, or adding money into a checking account, then transactions are the molecules, such as moving that cash advance into the checking account. In-between states are not allowed, and the transaction server guarantees that. Thus, if something fails somewhere in the transaction, all the atomic operations can be rolled back. It’s a complex system: the transaction server has to store these transactions in a database, for cases in which, say, there’s a power failure, and maintain a queue for transactions, in case the enterprise is highly complex. And, of course, there are issues of security. This centralized approach also has advantages in maintenance, but it does add another complex information system, especially if you’re dealing with a very large enterprise. It can easily mean yet another server farm, to make sure all the other server farms can talk to each other.
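The rollback idea can be sketched in a few lines of plain Java. This is only an in-memory illustration under stated assumptions, not how any real transaction server works: `Account`, `Adjust`, `AtomicOperation` and `Transaction` are hypothetical names, and the durable storage, queueing and security mentioned above are left out entirely. Each atomic operation knows how to undo itself, and the transaction undoes the completed steps in reverse order when a later step fails.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TransactionDemo {
    public static void main(String[] args) {
        // Credit card at -$450.00 with a $500.00 limit; checking holds $5.00.
        Account card = new Account(-45_000, -50_000);
        Account checking = new Account(500, 0);

        boolean ok = new Transaction()
                .add(new Adjust(checking, 10_000)) // deposit succeeds
                .add(new Adjust(card, -10_000))    // cash advance exceeds the limit
                .run();

        // The deposit has been rolled back: prints "false 500 -45000"
        System.out.println(ok + " " + checking.balance() + " " + card.balance());
    }
}

// One atomic operation offered by some information system, with a way to undo it.
interface AtomicOperation {
    void apply();
    void undo();
}

// Hypothetical in-memory account standing in for a remote information system.
class Account {
    private long cents;
    private final long floor; // lowest balance allowed (negative for a credit line)

    Account(long cents, long floor) {
        this.cents = cents;
        this.floor = floor;
    }

    long balance() {
        return cents;
    }

    void adjust(long delta) {
        if (cents + delta < floor) {
            throw new IllegalStateException("Limit reached");
        }
        cents += delta;
    }
}

// Adds (or, with a negative delta, removes) money from one account.
class Adjust implements AtomicOperation {
    private final Account account;
    private final long delta;

    Adjust(Account account, long delta) {
        this.account = account;
        this.delta = delta;
    }

    @Override public void apply() { account.adjust(delta); }
    @Override public void undo()  { account.adjust(-delta); }
}

// The "molecule": either every step is applied, or none remains applied.
class Transaction {
    private final List<AtomicOperation> steps = new ArrayList<>();

    Transaction add(AtomicOperation step) {
        steps.add(step);
        return this;
    }

    boolean run() {
        Deque<AtomicOperation> done = new ArrayDeque<>();
        try {
            for (AtomicOperation step : steps) {
                step.apply();
                done.push(step);
            }
            return true;
        } catch (RuntimeException failure) {
            // Undo the completed steps, most recent first.
            while (!done.isEmpty()) {
                done.pop().undo();
            }
            return false;
        }
    }
}
```

A real transaction server would also have to survive a crash in the middle of `run()`, which is exactly why it needs the database described above.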

There is also a third, distributed hybrid approach. Different information systems use simple controllers to give minimal standardization, while the application server, instead of doing the controllers' jobs, is left only to reroute information and handle transactions. This makes for simpler application servers, at the cost of losing some centralization.

(Note that a whole new industry, called “enterprise management” or “monitoring,” piggybacked on middleware, for all three approaches. With standardized ways of connecting came also standard ways of monitoring these various information systems, to make sure they are alive and healthy, and a central place (another database!) for them to report potential problems.)

An early comprehensive solution to the centralized approach was CORBA: Common Object Request Broker Architecture. CORBA controllers would expose themselves to the network as objects, while ORBs, the servers, would handle transactions. The nice thing about CORBA is that it’s an open architecture, and different vendors, for controllers or servers, can implement it as they like. This nice thing, however, has ended up being a huge problem. Though the different technologies can communicate with each other well enough, the engineers can’t. Because there are so many different underlying technologies, spanning almost four decades of IT development, no single team of engineers can encapsulate all this knowledge. Thus, CORBA ended up being enormously expensive to employ. Microsoft has its own hybrid solution, called DCOM (Distributed Component Object Model), which works with the central Microsoft Transaction Server (MTS). Though much cheaper than CORBA, because it relies on Microsoft’s simple development technologies, it requires controllers to run Windows, and prefers clients to do so, too. (It’s this Microsoft monopoly that enterprises really worry about, not the fact that all PCs come with Windows or that Windows comes with Internet Explorer!)

Enter Java. As with CORBA, Java has its own comprehensive middleware architecture, called Enterprise JavaBeans (EJB). However, unlike CORBA, which doesn’t define the implementing technology, EJBs require Java. This restriction is actually the solution. Because Java runs on a virtual machine, any controller, any server, and any client can run Java. Vendors have created Java virtual machines for an astounding diversity of platforms. IBM has especially invested in excellent Java support for their equipment, including Mainframes and the ubiquitous, multipurpose AS/400 server. Most importantly, a single team of engineers can maintain all this code centrally, using one language and one technology. They can even create simulated testing environments very easily. The Java Enterprise Edition platform is open, and many vendors have implemented it, so there’s no vendor monopoly as there is with DCOM. There are even free, good, innovative, open-source implementations of it, such as JBoss, which, with its Seam platform, offers extremely easy object self-management via Inversion of Control. On the other side of the Java spectrum, there’s Java Micro Edition, which many people will recognize from their Java-enabled cellphones. Private consumers are, as with operating systems, a side interest: the real advantage is in allowing for very cheap controllers running Java and connecting information systems to the enterprise, allowing for the hybrid distributed approach I mentioned above (Microsoft’s Windows CE operating system, which you see in many hand-held computers, was also developed specifically for controllers). There’s even a standard technology called JINI, which allows local networks of these Java-enabled devices to more-or-less manage themselves.

In short, for an enterprise to invest in Java is a no-brainer: there’s a lot of healthy competition, broad support, no vendor lock-in, plenty of Java engineers, and it’s overall very affordable.

Java also allows enterprises to move outside their limits, and that’s where we come back to where we started. Our bank, for example, may want to allow customers to access their bank accounts online. Though this is a data-driven web application scenario, you can forget about using Ruby on Rails to handle it. The Java application servers already support web services and web applications out of the box, and of course easily link to the rest of the enterprise. Once again, the same body of code can be used to handle everything. Do you need convincing? eBay is run entirely on Java.

It also should be a no-brainer for you to choose Java if you are developing a new web application. Though there might be certain innovative advantages to some platforms, most notably Ruby on Rails, in the long run any application that isn’t in Java is pretty much an island for the enterprise, yet another information system that would have to be integrated. If you want to be an island, that’s fine, but if you ever want to grow, and to be taken seriously by enterprise clients such as banks, corporations, military institutes, universities, factories, etc., you need to be enterprise ready. Merely choosing Java is many steps ahead in that direction. And, meanwhile, Java is evolving quickly, with EJB version 3 using Inversion of Control (like Seam), increasingly powerful database connectivity, Faces offering component-based web application development, and a few interesting projects meant to offer Rails’ advantages in Java. Innovation from outside reaches Java remarkably quickly.

But could Java face competition in the enterprise from these other technologies, coming from the web inward? I really, really doubt it. There’s nothing like JBoss for Ruby (it took a decade to develop all the pieces of JBoss!) and no Ruby for AS/400. Java is already ubiquitous, and Sun has recently gone one step further in releasing their own virtual machine as open source, guaranteeing Java’s adoption on even more platforms (open source platforms, most notably Linux, have made astounding inroads into the enterprise).

From my responses to comments on this article:

... By the way, I was involved pretty early on with DCOM, and it was really seen by Microsoft as a way to solve many of the CORBA issues. COM was really quite brilliant and far-reaching as a way to allow small binaries to move around the enterprise. As Microsoft saw it, vendors wouldn’t want something like Java to move around, because they would be worried about people “disassembling” their objects and getting the precious code. So, even an ActiveX control running in your web page (just a fancy name for COM deployment, really) would be opaque. But, as it turned out, to make a binary component work in so many environments required a very, very well defined memory model. COM architects called it “apartments,” and as complex as it was, it was far simpler than CORBA. As I said, it was really amazingly well thought-out for the problem. But, it seems, enterprises don’t much like opaque binaries swooshing through their infrastructures, putting them at the mercy of middleware vendors. I think it’s the same unease that brought them into open source operating systems, too, and which crushed companies like Digital. So, as brilliant as DCOM was, there was simply no market for it. Which is, in the end, a good thing.

... I did, indeed, mean that by “island,” and you’re right that even Java needs integration, but as I said later, merely choosing Java takes you quite a few steps forward towards integration. There are two aspects of it, both language and platform. It’s not that they each have some magical advantage, just that both aspects together are a “slam dunk,” as George Tenet would say.

We both know that “clearly defined interfaces,” that is, treating whole software domains as black boxes that merely offer specific services, doesn’t quite work as planned. What one programmer envisions as a perfect interface that would suit all future needs ends up being not quite what the user wants. It’s a bit stupid to expect that any application could be so divided (actually, having worked for Microsoft, I’d say that their downfall has to do with exactly such an organization for their products). Now, in the case of an operating system API, you’re pretty much stuck (unless it’s open source and you can invest the effort in tinkering with it), but in the enterprise, why not change the interface? With CORBA, you might have something similar to the API problem (for some reasons you mention in your next comment), but with Java, especially if you’re using J2EE, it’s not such a big deal to jump in. It can actually be quite straightforward to look at the old code and roll out a few new beans that do what you want them to do. And, if you made this choice early on, it may actually be the same programmer who wrote the original interface in the first place who’s working on the new project. What could be a huge problem becomes trivial to solve. The opposite of this, I called “an island.”

Of course, Java implements CORBA quite nicely, smoothing out all those memory handling issues you mention in your comment. But the CORBA implementation uses IDL and configuration files and such nonsense. EJB3, using Java annotations, is just so deliciously elegant. It’s almost a pleasure to program.

You’re right about JavaSpaces, but JINI controllers are what it was originally envisioned for. Actually, it’s not too different, is it? And really, if you look back, the whole Java platform was envisioned for controllers! I don’t think Azureus or Tomcat were on the horizon...

By the way, those niche domains are SUCH crapshoots. I mean, it’s great if you have the skills, but the only way to get them is by randomly being assigned to some shit project in a shit job. Otherwise, it’s not like you can sit at home and learn FORTRAN, put it on your resume, and expect the $$$ to roll in. You need experience, but it’s experience that’s hard to come by in the first place. I actually can’t think of a “plan” by which you would get into that field. However, it seems that “entry level” Java developers get hired left and right these days. There’s just so much little coding that needs to be done to keep this all sticking together.

... I’d add that we’re getting close to finally not believing in magic solutions that make parts of the infrastructure “invisible.” We’re seeing a return to programming practices in which the engineer actually has access to all the pipes and structures. Thus versioning, inheritance, etc. I’d add that Ruby’s scaffolding construct is yet one more step in the awesome direction of transparent code reuse, rather than opaque reuse as you have in those old abstract “black boxes.”

You’re absolutely right that remote interfaces are vastly different from local. I’ll add, too, that we’re getting better at handling component state. Some objects handle their own state, but others don’t, and this leads to vastly different programming models, for which no one-size-fits-all solution can work. It’s IoC and frameworks like Spring and Seam that allow the programmer to control how to handle state, rather than give control to some application server. Again, more transparency for the programmer, fewer black boxes, better software. QED, RIP, FTW.