The Artima Developer Community
Weblogs Forum
Why is Distributed so Hard?

8 replies on 1 page. Most recent reply: May 4, 2004 11:21 AM by Dale Asberry

Dale Asberry

Posts: 161
Nickname: bozomind
Registered: Mar, 2004

Why is Distributed so Hard? (View in Weblogs)
Posted: May 3, 2004 3:52 AM
Summary
If developers don't "get it", then it makes great sense to find out why. One issue in particular jumps to the forefront.

Overview

In one sentence, here's why: humans are notoriously bad at keeping "self" distinct from "other". Egomania, projection (transference), and enmeshment are well-known symptoms of this problem. OK, so I hear you saying, "yeah, but what does this have to do with programming?" If we are bad at drawing boundaries around the thing we know the most about (our "selves"), it seems absurd to claim that we have a good approach for the programming analogues: objects, modules, etc.

A recent offline conversation with Paul Hammant, co-lead of PicoContainer (an Inversion of Control/Dependency Injection container), was a primary motivator for this blog entry. IoC is a simple, elegant way to encapsulate dependencies, yet according to Paul, it has taken years for the developer community to see the value in using it. So, programmers have serious trouble seeing and removing dependencies. Distributed solutions by their very nature require considerable thought and effort in removing dependencies, and human tendencies interfere with eliminating them, so it should be no surprise that distribution is hard.
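
For readers who have not met IoC, here is a minimal sketch of the idea using hand-wired constructor injection. It is not PicoContainer's API, and every name in it (MessageStore, InMemoryMessageStore, Guestbook) is hypothetical, chosen only for illustration. The point is that Guestbook is handed its dependency instead of creating or looking it up.

```java
import java.util.ArrayList;
import java.util.List;

class IocSketch {
    public static void main(String[] args) {
        // Wiring happens out here, in one place, not inside the components.
        InMemoryMessageStore store = new InMemoryMessageStore();
        Guestbook book = new Guestbook(store);
        book.sign("Dale");
        System.out.println(store.saved);
    }
}

// The only thing Guestbook knows about storage (hypothetical interface).
interface MessageStore {
    void save(String message);
}

// One possible implementation; a file- or network-backed store could replace it.
final class InMemoryMessageStore implements MessageStore {
    final List<String> saved = new ArrayList<>();

    @Override
    public void save(String message) {
        saved.add(message);
    }
}

final class Guestbook {
    private final MessageStore store;

    // The dependency is handed in; Guestbook never constructs or looks up a store.
    Guestbook(MessageStore store) {
        this.store = store;
    }

    void sign(String name) {
        store.save(name + " was here");
    }
}
```

Because Guestbook names only the interface, swapping the in-memory store for a remote one would not touch Guestbook at all, which is the property a distributed design leans on.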

The Problems

We humans are bad at it to begin with, and several other factors feed back and further discourage decoupling: it is a hard problem to solve even when it can be seen; the less decoupling is practiced, the harder the coupling gets to undo; and the original team is usually gone by the time the problems become insurmountable.

What in particular makes decoupling a hard problem to solve? For one, it takes a lot of experience and a bit of natural ability to perceive software at different levels of abstraction. Without that combination of (learned) skill and (natural) ability, it is hard to see that a dependency exists AND then to remove that dependency. For another, it's human nature to take the easy/lazy route first. I get a kick out of the demotivational poster "Procrastination": "Hard work often pays off after time, but laziness always pays off now."

In addition, when programmers fail to implement decoupling early, the coupling/cohesion problems start to snowball. Coupling problems tend to manifest themselves indirectly through unexpected side effects. The first response is usually to add more code to either limit or expand functionality. Over time, this sort of coding crud piles up. Sure, refactoring will help clean up some of these problems; however, it often misses "big picture" interactions. On the topic of agile, refactoring and YAGNI are great general principles. But to clarify, they in no way suggest that programmers should skip designing in loose coupling. In fact, the first CouplingAndCohesion c2.com wiki entry by Wyatt Matthews says that loose coupling and YAGNI need to go hand-in-hand:

  • This dictates against PrematureGeneralization. After all, PrematureGeneralization increases an object's dependencies needlessly.
  • This supports DontRepeatYourself. If I am too interdependent, I've probably repeated myself needlessly and should RefactorMercilessly.
  • This supports OnceAndOnlyOnce. DontRepeatYourself recommends OAOO whenever possible.
  • This wants to support YAGNI. After all, if YouArentGonnaNeedIt, why head toward more instability?

The last form of negative selection also becomes significant over time: the original team is gone. This problem is two-fold: first, no one is around to take responsibility for the problems, and second, a great deal of the application knowledge base left with the original team members. The first problem is particularly compounded by consulting. Most consultants are in and out in less than a year. My experience suggests that the most serious problems start cropping up around the one-year mark. Since consultants are under heavy pressure to "deliver", why should they invest their time in the difficult and time-consuming task of eliminating dependencies? Because the problems are subtle, widespread, and not usually associated with high coupling, the consulting company rarely takes heat for its role in the problems. Considering that coupling issues become apparent much more quickly in a distributed system, it should come as no surprise that consultants recommend avoiding distribution.

Some Solutions

The first step in finding a solution is in recognizing the source of the problems. Let's first look at the human analogues.

Enmeshment can be slowly eliminated by unweaving the co-dependencies. Listing and communicating previously unspoken expectations is a great start. People who begin this process are often overwhelmed by the sheer number of unspoken expectations, and surprised to find that they were oblivious to most of them.

In some ways, marriage vows do a disservice to the richer subtleties in intimate human interaction. Namely, two people don't come together to become one, they come together to become three! There will always be the self and the other. The third ingredient is the relationship itself. Unless the relationship is tended to by both people, it won't work. The most successful couples:

  • address expectations through rituals (protocols) and communication
  • adjust to the situation
  • are forgiving of transgressions
  • are interdependent, neither independent nor dependent

The second step in solving problems is to discover and leverage the lessons others have learned. The following excerpts only touch the tip of the iceberg. Read, study, and internalize these references - make them your own.

Carlos Perez wrote recently in his Manageability blog about six operators of modularity that are essential for modular systems:

  • Splitting - Modules can be made independent.
  • Substituting - Modules can be substituted and interchanged.
  • Excluding - Existing Modules can be removed to build a usable solution.
  • Augmenting - New Modules can be added to create new solutions.
  • Inverting - The hierarchical dependencies between Modules can be rearranged.
  • Porting - Modules can be applied to different contexts.

from c2.com AbstractInteractions:

Context

The context in which one uses a component includes the other components with which it communicates. If a component makes assumptions about how those components are implemented, it becomes hard to reuse in combination with different components.

Problem

How can you reduce a component's dependence on other components in its context?

Solution

  • Define protocols by which components interact separately from the components themselves.
  • Codify these protocols as abstract interfaces
  • Implement components that rely only on abstract interfaces. That is, components should refer only to abstract interfaces not to concrete classes that conform to the interfaces.
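
The three steps above can be sketched roughly as follows. This is only an illustration under assumed names (ReadingListener, ReadingSource, Thermometer, ReadingLog are all hypothetical, not taken from the c2.com page): the protocol is written down as two abstract interfaces, and each component refers only to those interfaces, never to the other concrete class.

```java
import java.util.ArrayList;
import java.util.List;

class AbstractInteractionsSketch {
    public static void main(String[] args) {
        Thermometer thermometer = new Thermometer();
        ReadingLog log = new ReadingLog();
        thermometer.subscribe(log);
        thermometer.report("21.5C");
        System.out.println(log.lines);
    }
}

// The protocol, defined apart from any component that speaks it.
interface ReadingListener {
    void onReading(String reading);
}

interface ReadingSource {
    void subscribe(ReadingListener listener);
}

// Refers only to ReadingListener, never to ReadingLog.
final class Thermometer implements ReadingSource {
    private final List<ReadingListener> listeners = new ArrayList<>();

    @Override
    public void subscribe(ReadingListener listener) {
        listeners.add(listener);
    }

    void report(String reading) {
        for (ReadingListener listener : listeners) {
            listener.onReading(reading);
        }
    }
}

// Knows nothing about Thermometer; any ReadingSource could feed it.
final class ReadingLog implements ReadingListener {
    final List<String> lines = new ArrayList<>();

    @Override
    public void onReading(String reading) {
        lines.add("reading=" + reading);
    }
}
```

Only the wiring code in main names both concrete classes, so either side can be reused with a different partner that conforms to the same protocol.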

Bill DeHora's weblog entry "Foundations for component and service models" also contains a wealth of ideas. In bullet form:

  • Java best practices won't help
  • Avoid changing or extending the interface methods
  • Control change by using a dictionary interface
  • Calls should return documents not objects
  • Avoid binary compatibility
  • Don't confuse an API with a contract
  • Version the contract
  • Don't build an API for data transfer
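
One possible reading of a few of these bullets (a dictionary interface, calls that return documents, a versioned contract) is sketched below. This is my own interpretation under stated assumptions, not code from Bill's entry: DocumentService, QuoteService, and the keys "contract-version", "symbol", "status", "price", and "currency" are all hypothetical, and the price is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

class DocumentContractSketch {
    public static void main(String[] args) {
        DocumentService service = new QuoteService();
        Map<String, String> request = new HashMap<>();
        request.put("contract-version", "2");
        request.put("symbol", "ACME");
        System.out.println(service.handle(request).get("currency"));
    }
}

// One fixed method; change is carried by the keys of the document, not by new methods.
interface DocumentService {
    Map<String, String> handle(Map<String, String> request);
}

final class QuoteService implements DocumentService {
    @Override
    public Map<String, String> handle(Map<String, String> request) {
        Map<String, String> reply = new HashMap<>();
        // Callers that send no version are treated as speaking version 1.
        String version = request.getOrDefault("contract-version", "1");
        reply.put("contract-version", version);

        if (!request.containsKey("symbol")) {
            reply.put("status", "error");
            reply.put("reason", "missing symbol");
            return reply;
        }

        reply.put("status", "ok");
        reply.put("symbol", request.get("symbol"));
        reply.put("price", "42.00"); // placeholder value for the sketch

        // Field added in version 2; version 1 callers never see it.
        if ("2".equals(version)) {
            reply.put("currency", "USD");
        }
        return reply;
    }
}
```

The trade-off is visible even at this size: the compiler no longer checks the shape of the exchange, so the versioned contract has to do that work instead.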

While researching backlinks to Bill's article, I found that Carlos had summarized several properties of loose coupling:

Dimension                  Tight Coupling                 Loose Coupling
Interface                  Class and Methods              REST like (i.e. fixed verbs)
Messaging                  Procedure Call                 Document Passing
Typing                     Static                         Dynamic
Synchronization            Synchronous                    Asynchronous
References                 Named                          Queried
Ontology (Interpretation)  By Prior Agreement             Self Describing (On The Fly)
Schema                     Grammar Based                  Pattern Based
Communication              Point to Point                 Multicast
Interaction                Direct                         Brokered
Evaluation (Sequencing)    Eager                          Lazy
Motivation                 Correctness, Efficiency        Adaptability, Interoperability
Behavior                   Planned                        Reactive
Coordination               Centrally Managed              Distributed
Contracts                  By Prior Agreements, Implicit  Self Describing, Explicit
Transactions               Pessimistic                    Optimistic
Classification             Classes                        Prototype

(2004-05-04 Updated: from Loosely Coupled Dimensions [Updated])
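
To make a few rows of the table concrete (Messaging, Synchronization, Communication, Interaction), here is a toy in-process sketch of the loose-coupling column. It is an assumption-laden illustration, not something from Carlos' post: Broker, its methods, and the topic name are hypothetical, and a real system would put a network and a message queue where this class uses a plain in-memory queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

class BrokerSketch {
    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.subscribe("orders", doc -> System.out.println("billing got " + doc));
        broker.subscribe("orders", doc -> System.out.println("shipping got " + doc));
        broker.publish("orders", "order-17");   // returns immediately, nothing delivered yet
        broker.drain();                          // delivery happens later, to both subscribers
    }
}

// Brokered: publishers and subscribers know the broker and a topic name, not each other.
final class Broker {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();
    private final Queue<String[]> pending = new ArrayDeque<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Asynchronous in a loose sense: publish only enqueues the document and returns.
    void publish(String topic, String document) {
        pending.add(new String[] {topic, document});
    }

    // Multicast: each queued document goes to every subscriber of its topic.
    // Returns the number of deliveries made.
    int drain() {
        int delivered = 0;
        String[] next;
        while ((next = pending.poll()) != null) {
            List<Consumer<String>> handlers =
                    subscribers.getOrDefault(next[0], new ArrayList<>());
            for (Consumer<String> handler : handlers) {
                handler.accept(next[1]);
                delivered++;
            }
        }
        return delivered;
    }
}
```

The tight-coupling column would be the publisher calling billing and shipping directly, by name, and waiting for each to return.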

For the final reference, all I can say is "Wow". This paper contains so much good stuff that it should be included in every developer's library. Read Caterpillar's Fate on the c2.com Wiki to see what I mean.

Conclusions

Hopefully it isn't too difficult to see that people are bad at recognizing differences between self and others. It should also not be too hard to see that these difficulties would spill over into programming.

We should also recognize that other factors aggravate the problem and further discourage decoupling. First, it's a hard problem to see, let alone to solve. Second, the longer developers go without keeping on top of dependencies and continually removing them, the harder they get to remove. Third, these problems are exacerbated by the fact that the original team members (and especially consultants) leave the project, sometimes to avoid the negative consequences of blame, taking hard-learned domain expertise with them.

Distributed applications are particularly hard-hit by these things. Distributed applications are inherently more difficult than "standard" applications because they simply cannot be built without making them very decoupled. (See Carlos' coupling table above).

On the positive side, many very bright people have found ways to overcome these dependency problems. The solutions are subtle and complex, which means that they can't be applied in cookie-cutter fashion. If developers assimilate these solutions into their personal knowledge base, then all types of applications can be taken to the next level of complexity.


Resources

Definitions

Egomania
An intense and irresistible love for yourself and concern for your own needs

http://www.cogsci.princeton.edu/cgi-bin/webwn?stage=1&word=egomania

Projection
A type of defence mechanism. A person experiences an emotion or thought that they can't deal with exactly for whatever reason. The unacceptable feeling or thought is experienced as though someone else had been thinking or feeling it.

http://www.mentalhealth.org.uk/wordbank.cfm?wordid=551&wbletter=P

Enmeshment
The term "Enmeshment" comes from the family systems theory tradition. Enmeshment refers to a condition where two or more people weave their lives and identities around one another so tightly that it is difficult for any one of them to function independently. The opposite extreme way of relating, Detachment, refers to a condition where the people are so independent in their functioning that it is difficult to figure out how they are related to one another. Healthy relationships are thought to be described by the space between enmeshment and detachment.

http://mentalhelp.net/poc/view_index.php?idx=37&id=156

Interdependent
Mutualist, mutually beneficial -- (mutually dependent)

http://www.cogsci.princeton.edu/cgi-bin/webwn?stage=1&word=interdependent


Michael Feathers

Posts: 448
Nickname: mfeathers
Registered: Jul, 2003

Re: Why is Distributed so Hard? Posted: May 3, 2004 8:31 AM
I agree. All of those things are tough. They are hard for people to see and act upon.

The one thing that keeps me optimistic is noticing that when people start to see some advantage to creating classes in test harnesses through TDD, they start to create loosely coupled software by default and they start to notice what it looks like and how it is different from the intermingled stuff.

Sometimes that is more effective than telling people. The thing that is hard to face is that many people are not as facile with abstraction as others and they have to approach this kind of learning through experience.

Carlos Perez

Posts: 153
Nickname: ceperez
Registered: Jan, 2003

Re: Why is Distributed so Hard? Posted: May 3, 2004 9:24 AM
You might want to reference Conway's Law in your arguments.

http://www.manageability.org/blog/stuff/schemaless_world/view

Distribution is hard because achieving consensus is equally hard.

Dale Asberry

Posts: 161
Nickname: bozomind
Registered: Mar, 2004

Re: Why is Distributed so Hard? Posted: May 4, 2004 6:23 AM
Thanks for the reference; however, the link titled "Conway Way's Law" (http://www.phptr.com/isapi/product_id~%7B125695DD-0FED-4A4B-AFB1-6CC1A72E11A3%7D/articles/index.asp) appears to be broken.

Dale Asberry

Posts: 161
Nickname: bozomind
Registered: Mar, 2004

Re: Why is Distributed so Hard? Posted: May 4, 2004 6:50 AM
I've also updated the table to reflect the changes at http://www.manageability.org/blog/stuff/loosely-coupled-dimensions/view

That's good stuff Carlos.

Also, I think the causes behind your argument, "because achieving consensus is equally hard," can be reduced to a combination of the three problems: Egocentrism, Projection, and Enmeshment. Tying these problems together with "Unskilled and Unaware of it" (http://www.apa.org/journals/psp/psp7761121.html) makes designing by consensus an exercise in frustration and futility. That leads to the next equally hard problem of deciding "who". That's one I'm still thinking about.

Dan Creswell

Posts: 49
Nickname: dancres
Registered: Apr, 2003

Re: Why is Distributed so Hard? Posted: May 4, 2004 7:29 AM
> Thanks for the reference, however, the link titled "Conway Way's Law"
> (http://www.phptr.com/isapi/product_id~%7B125695DD-0FED-4A4B-AFB1-6CC1A72E11A3%7D/articles/index.asp)
> appears to be broken.

How 'bout this instead?

http://www.informit.com/articles/article.asp?p=26567&redir=1

Carlos Perez

Posts: 153
Nickname: ceperez
Registered: Jan, 2003

Re: Why is Distributed so Hard? Posted: May 4, 2004 8:55 AM
IANAP (I am not a Psychologist), however you may need to go beyond individual psychology and into group psychology to analyze why people can't get agreement. However, I think we can all come up with plenty of motivations why cooperation isn't a good thing (e.g. employment security, resume padding, empire building etc.)

David Ramsey

Posts: 34
Nickname: dlramsey
Registered: Apr, 2002

Re: Why is Distributed so Hard? Posted: May 4, 2004 11:15 AM
I get a URL not found response for http://c2.com/ppr/catsfate.htm

Is there a link available that works for that?

Dale Asberry

Posts: 161
Nickname: bozomind
Registered: Mar, 2004

Re: Why is Distributed so Hard? Posted: May 4, 2004 11:21 AM
My bad... copy+paste problem. http://c2.com/ppr/catsfate.html should work. I've updated my original entry too.

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use