The Artima Developer Community
Java Community News
Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java?

13 replies on 1 page. Most recent reply: Oct 14, 2007 2:57 PM by Isaac Gouy

Frank Sommers

Posts: 2642
Nickname: fsommers
Registered: Jan, 2002

Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 5, 2007 6:12 PM
Summary
Observing the architectures powering some of the most heavily visited Web sites, GigaSpaces' Nati Shalom asks why so few of them are written in Java.
Drawing parallels between the requirements of some of the most heavily trafficked Web sites, on one hand, and heavily used enterprise applications, on the other, GigaSpaces CTO Nati Shalom observes in "Why most large-scale Web sites are not written in Java" that while their problem domains share similar requirements, the two types of system chose different implementation paths to meet their scalability needs:

Most of these sites are using LAMP as the core runtime stack. Some have gone so far as to develop their own file system (Google, GFS). Some are using caching to solve the database bottleneck (memcached and the like)...

The application stack of these Web applications is very different from the stack that mission-critical applications in the financial world are built with. In the financial world, Java—and to a lesser degree J2EE—is used extensively. In recent years scalability requirements in capital markets led to a rapid shift in the middleware stack, introducing Compute Grid solutions for virtualization of CPU resources, enabling parallelization of batch applications. Data Grids were also introduced, enabling the virtualization of memory resources. Spring is becoming the common development framework in this world. ..

If we examine both worlds, we can see that both are facing similar challenges related to scalability.

Shalom asks why Web sites with high-scalability requirements did not just use the Java EE stack, and why they tend to prefer LAMP instead.


Emil Kirschner

Posts: 9
Nickname: entzik
Registered: Aug, 2007

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 8, 2007 4:58 AM
I think this is due to historical reasons. Popular websites weren't popular from the beginning. Most used to be very small shops with few financial resources. While J2EE is quite powerful and can scale well, starting up a web site using this technology is significantly more costly:

1) Few people have the level of J2EE understanding required to design, develop and deploy a full J2EE application. These people are rare and therefore more expensive.
2) You can get cheap LAMP hosting all over the place, for as little as $5 a month. This is far from being the case with J2EE. To obtain a decent level of stability you'd have to rent a dedicated server, and that's expensive.

LAMP's cost efficiency enables startups to get to market quickly and cheaply, and once the website is online, very few people consider it worth the effort to rewrite the entire thing in a new technology. The operational risks (regressions and the like) are too big, and the need to spend a lot of money on such a rewrite may be difficult to explain to investors.

nes

Posts: 137
Nickname: nn
Registered: Jul, 2004

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 8, 2007 4:21 PM
Banks have lots of money to throw at the problem and often over-engineer for future growth. They also prefer products of other big and stable companies.

Startups are often short on money and are mostly concerned with being first to market. They also prefer to deal with smaller companies and open source because they have more input into the future of the tools there.

Cameron Purdy

Posts: 186
Nickname: cpurdy
Registered: Dec, 2004

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 8, 2007 8:07 PM
Many of the web sites mentioned are built with Java, and utilize a number of Java EE technologies. I think what Nati might have meant to ask is "Why aren't they built with Entity EJBs?" .. people seem to get confused and say "J2EE" instead of "version 1 or 2 EJB entity beans" ..

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html

Todd P

Posts: 10
Nickname: taude
Registered: Feb, 2007

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 9, 2007 11:09 AM
Have you ever priced out licensing costs for 100 app servers with J2EE and say an Oracle backend?

Banks can afford this, and more importantly the support that comes with it. This gives the managers a warm-and-fuzzy feeling, plus a vendor to point fingers at when problems occur.

cwac5

Posts: 1
Nickname: cwac5
Registered: Feb, 2003

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 9, 2007 7:28 PM
> Have you ever priced out licensing costs for 100 app
> servers with J2EE and say an Oracle backend?
>
> Banks can afford this, and more importantly the support
> that comes with it. This gives the managers a
> warm-and-fuzzy feeling, plus a vendor to point fingers at
> when problems occur.

Let's see... 100x (JBoss/Glassfish/etc.) = $0
And to compare apples to apples, why don't we admit you can use the same DB as LAMP, bringing the total cost to... free.

Now you also have options for clustering, load-balancing, fail-over, messaging, persistence, etc. all built in.

CEO risk-averse? Well, hey, you at least have the option of support licenses.

Mike O'Keefe

Posts: 16
Nickname: kupci2
Registered: Mar, 2005

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 9, 2007 11:03 PM
A couple points I haven't seen mentioned elsewhere:

1. Some of the biggest sites are using LAMP since that's what was proven, reliable, and had market share and skilled programmers when they were written. Take Lands' End, one of the first e-commerce sites, written in Perl and C, running on an Oracle DB. Or Slashdot: Perl. Still is, I think. Certainly Slashdot scales, given the "slashdot effect": when Slashdotters hit a linked site, it invariably falls over.

2. All of the sites in Shalom's list are what I would call "content management", not heavy on the business logic. I think Shalom understands this as web vs. financial. But the point here is, if you are serving up mostly static pages, without too much business logic, certainly those are different requirements than a "financials" type app. Take Flickr (photos), YouTube (video), Digg (text messages), etc.

3. In addition to point #1, it is also important to consider the skills of the person writing the app. If the folks who wrote eBay (as legend has it, over a weekend with a few Perl scripts) or Facebook (PHP, in a similarly short time, I think) had followed the usual procedure required for your basic business app, they'd still be hashing out the requirements ;)

Additionally, while there are certainly tools to simplify this (Rational Application Developer, NetBeans, Eclipse, AppFuse), I can guarantee that the effort required to hack out a simple app in J2EE (or Spring, or Struts, etc.), including the steep learning curve on top of learning Java, is considerably more than hacking together a few Perl scripts (or, nowadays, Ruby/Rails).

To put it another way, Bill Gates scraped together 50K and bought the rights to Q-DOS. It didn't matter that even then there were much better OSs out there, like CP/M. Market share is the key. There is time enough later to rewrite and build on more solid ground, like NT. What's important is to take the time to do that; this will be key for Facebook.

Finally, I note that GigaSpaces has some openings for Java and .NET programmers. Not LAMP. Why?

Mike O'Keefe

Posts: 16
Nickname: kupci2
Registered: Mar, 2005

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 10, 2007 12:15 AM
Right, and now that I've read a few of the comments on his posts, I agree with the one that said this would be good advertising. I think in his next post he will explain his idea of why (but wait! there's more!): they have a solution, and it scales better than J2EE.

If you read the posts and documentation, they talk about JavaSpaces, grids (messaging and data), and all communication being in-process (like CORBA or RMI but better). Also, their business objects are "very similar to Message Driven Beans in the J2EE framework".

So - we have an alternate to J2EE, JavaSpaces and Spring-driven, but somehow there's a string attached I'll wager, and I'll need to run this on some of their software, which has the required "extensions" on their next generation app server, which seems to go against the whole idea of J2EE, i.e. a level playing field, tiers, with best-of-breed. To put it simply, my MDB won't run in their environment, and I'll need a whole different software layer to run their "similar but not quite the same" MDB. Same goes for the Database grid.

Following that idea (an advantage in J2EE is seen as a disadvantage), the benefits of their RMI solution strike me as disadvantages. Certainly you can build failover and load balancing into a CORBA/RMI solution, but it seems that a better solution is to have this as an entirely separate layer, such as WebSphere XD, Citrix, VMware, or even simpler solutions like Apache HTTP Server and clusters.

Certainly this will be fast, but with the Web Services implementations continually pushing the envelope, I'm not sure the performance improvement would outweigh the reduction in interoperability.

Anyway, it will be interesting to see his answer, and how this will be (when it's available) better than what we already have now, with clusters (yes, running on appservers), and things like WebSphere XD, which brings SLA capability to the midtier, or Terracotta. Certainly might be useful for the LAMP market, but then it might be just as easy for them to switch to .NET or J2EE, or even Erlang and Mnesia, for that matter.

My take is, any extra competition for J2EE is good.

Cameron Purdy

Posts: 186
Nickname: cpurdy
Registered: Dec, 2004

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 10, 2007 11:35 AM
> So - we have an alternate to J2EE, JavaSpaces ..

Despite the claims in the article, there are many large-scale transactional web sites (e.g. eBay, Amazon, etc.) that run on Java EE with high-scale clustering solutions like Oracle Coherence.

Interestingly, none of the large sites mentioned runs on JavaSpaces, even though it's been around since '98 -- longer than Java EE (previously J2EE). And the one site that I know of that used Jini extensively is running Coherence for data grid functionality.

> Certainly this will be fast ..

That is hardly certain. Achieving durability (important for transactional systems) is no simple matter, and introducing a new element into an ecosystem means that every junction point incurs a 2PC / dual durability requirement. We replaced a JavaSpaces-based system recently precisely because it drastically and negatively impacted performance, all due to the durability/reliability versus performance trade-off.

In that case, taking messages off of one durable message queue and putting them into a JavaSpace meant a transaction on the messaging side coordinated with a transaction on the JavaSpaces side, with both sides having to write to disk for durability (and with the JavaSpaces side being way slower than the Tibco side, of course, since Tibco was actually architected to do just this, has 100x the R&D resources, and has been refining its solution for many years).
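To make that cost concrete, here is a minimal Java sketch of the two-phase commit pattern described above. It assumes a hypothetical `Participant` interface and a toy `DurableResource` that only counts forced disk writes; none of these names come from JMS, Tibco, or JavaSpaces, and the coordinator's own decision log is left out. It is an illustration of the general technique, not a reconstruction of the system in question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical participant contract; not taken from JMS, Tibco, or JavaSpaces.
interface Participant {
    boolean prepare();  // phase 1: persist pending work, then vote yes/no
    void commit();      // phase 2: persist the commit decision
    void rollback();    // phase 2 on failure: discard the pending work
}

// Toy resource that counts the forced disk writes a durable store would need.
class DurableResource implements Participant {
    final String name;
    final boolean votesYes;
    int forcedWrites = 0;
    String state = "IDLE";

    DurableResource(String name, boolean votesYes) {
        this.name = name;
        this.votesYes = votesYes;
    }

    @Override public boolean prepare() {
        if (!votesYes) {
            state = "ABORTED";
            return false;
        }
        forcedWrites++;          // pending work must reach disk before voting yes
        state = "PREPARED";
        return true;
    }

    @Override public void commit() {
        forcedWrites++;          // the commit record must reach disk as well
        state = "COMMITTED";
    }

    @Override public void rollback() {
        state = "ABORTED";
    }
}

class TwoPhaseCommitSketch {
    // Returns true if every participant prepared and was then committed.
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // One "no" vote aborts everyone who had already prepared.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
        }
        for (Participant p : prepared) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        DurableResource queue = new DurableResource("message queue", true);
        DurableResource space = new DurableResource("space", true);
        boolean ok = run(Arrays.asList(queue, space));
        System.out.println("committed=" + ok
            + ", forced writes: " + queue.name + "=" + queue.forcedWrites
            + ", " + space.name + "=" + space.forcedWrites);
    }
}
```

In this toy model each message moved between the two stores pays forced writes on both sides, and the whole transfer proceeds at the pace of the slower participant.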

The JavaSpaces vendor had pitched their solution as eliminating system-to-system 2PC by being able to replace the Tibco messaging infrastructure with one built on top of a JavaSpace, but apparently that only worked in a PowerPoint presentation. First, there were hundreds of other systems already integrated with the Tibco system, so it could not be replaced. Second, even if it could be replaced, why would they risk replacing a working system with an unproven one? Third, it turned out that when any actual QoS was required, the JMS on top of a JavaSpace was way slower than Tibco, even without the 2PC issues. And fourth, when they actually got to testing it, it lost messages.

So like I said, higher performance is hardly certain.

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html

nes

Posts: 137
Nickname: nn
Registered: Jul, 2004

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 10, 2007 12:10 PM
> lets see... 100x (JBoss/Glassfish/etc.) = $0
> and to compare apples to apples why don't we admit you can
> use the same DB as LAMP, bringing the total cost to...
> free.
You can't ask "why don't big web sites use what banks use?" and then replace the whole stack of products banks use with "unproven" alternatives. Banks don't use Glassfish and MySQL; they use WebSphere and DB2. Also, you have to take history into consideration: MySQL and PHP started in 1995, Glassfish in 2005. Of course nowadays a web programmer has more options.

Juancarlo Añez

Posts: 12
Nickname: juanco
Registered: Aug, 2003

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 10, 2007 1:40 PM
New companies can build their software with whatever they like.

Established companies (especially those from before digital computing) have fewer options. They still use and maintain hundreds of thousands of lines of legacy (often mainframe) code, and only companies like Sun or IBM can provide the glueware and middleware needed to integrate what exists with the modern stuff required by the new context. That's where Java can do what Pxxx can't.

Nati Shalom

Posts: 3
Nickname: natis
Registered: Oct, 2002

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 11, 2007 6:28 PM
Hi Mike

I think that your previous post summarized the issue very well, and with your permission I'll quote you in my response.


"Anyway, it will be interesting to see his answer, and how this will be (when it's available) better than what we already have now, with clusters (yes, running on appservers)"


You are right to say that the things that we're doing at GigaSpaces have a lot of relevance to this discussion, i.e. scalability.

Having said that, I can assure you that the intent behind my question was pure curiosity. I started to follow the LAMP stack more closely just recently. It seems that all of a sudden the large website companies (Google, Amazon, eBay...) decided to publish their architectures publicly. Looking into all this interesting information got me thinking....
What can we learn from the fact that similar challenges have been dealt with in similar ways but with different implementation approaches? Clearly there must be an interesting lesson we can learn from it. You simply can't argue with the fact that the sites that I was referring to were able to deal with relatively large-scale requirements without Java. Note that I wasn't saying that you can't do it with Java; we all know that you can, right? As many said, there are plenty of references to prove that. I was mainly pointing to the fact that you can do it WITHOUT JAVA and get pretty good results. Some argue that you can do so in a much simpler way.

Now I'm probably going to disappoint you, since in my next post I'm not going to talk about how we (GigaSpaces) can address similar challenges. (There is plenty of information available on that; the most recent is an interview by my colleague Geva Perry: http://fishtrain.com/2007/09/26/interview-with-gigaspaces ) At this point I'd rather summarize what I learned from this interesting discussion.

Anyway, I enjoyed reading your analysis.

Nati S.
http://natishalom.typepad.com

Sean Landis

Posts: 129
Nickname: seanl
Registered: Mar, 2002

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 11, 2007 9:22 PM
I agree that a major reason many of the big web companies do not use Java is because they made choices early on and have chosen to live with them. Early companies chose C or Perl. A little later LAMP was tried and true. My understanding is that Amazon is still primarily Perl and not J2EE.

Unfortunately, Sun put forth J2EE as the answer to web development and I think that may have turned many companies away from Java. J2EE is fine for a large class of web applications but for the really demanding systems it just won't do. For mission-critical web sites that make (or lose) millions of dollars a day, it is critical to have control over the stack.

Where I work, our site was originally developed in Pro/C, C, and C++ because at the time, Java seemed too risky. Over the last two years, we have been migrating to Java. We avoided most of the JavaEE stack, but not all of it. We embrace JPA (via Hibernate), for example. We make use of several technologies in the JavaEE bundle such as JAXB, JMS, and JavaMail. We have no JSP or EJB. We have built a POJO-based framework (fairly light) that does all our dispatching from Tomcat and provides the infrastructural tools we need. Our presentation layer uses Jamon (similar to Velocity but with Java type safety). We pretty much have total control with this approach, and the cost of maintaining our own framework (which isn't all that much) is well worth it.

For us, Java SE has provided huge rewards in scalability and performance. We have often gone in directions other than the 'conventional wisdom' but always with sound reason.

Our gains are primarily due to architectural choices: Statelessness, concurrency and parallelism, wise use of caching, for example. We also hook in third party software where it provides the big payoff: search, recommendations, etc.
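As a rough illustration of how statelessness and caching can combine, here is a minimal read-through cache sketch in plain Java SE. The class name `ReadThroughCache` and the loader function are hypothetical and are not taken from the site described here; the only library call relied on is `ConcurrentHashMap.computeIfAbsent`, which runs the loader at most once per absent key even under concurrent access.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: callers keep no state of their own and
// ask the cache, which falls back to the (slow) loader only on a miss.
class ReadThroughCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only when the key is absent.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Drops a cached entry so the next get() reloads it from the backend.
    void invalidate(K key) {
        entries.remove(key);
    }
}

class CachingSketch {
    public static void main(String[] args) {
        AtomicInteger backendCalls = new AtomicInteger();

        // Stand-in for a database lookup; the naming scheme is made up.
        ReadThroughCache<Integer, String> productNames =
            new ReadThroughCache<>(id -> {
                backendCalls.incrementAndGet();
                return "product-" + id;
            });

        productNames.get(42);
        productNames.get(42);   // served from memory, no second backend call
        System.out.println("backend calls: " + backendCalls.get());
    }
}
```

A real deployment would also need an eviction or expiry policy, which this sketch deliberately leaves out.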

We are one of the largest on-line retailers and I am very comfortable saying we are right near the top in performance and efficiency regarding our web site implementation. We have millions of customers, terabytes of data, and experience peak loads of 2,000 requests a second.

Our experience is that an all-Java solution - properly architected - is a great way to go.

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Nati Shalom: Why Are Most Large-Scale Web Sites Not Written in Java? Posted: Oct 14, 2007 2:57 PM
Sean Landis wrote
> Our presentation use Jamon ...

Interesting, thanks for sharing.

Copyright © 1996-2017 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use - Advertise with Us