The Artima Developer Community
Weblogs Forum
"hi there".equals("cheers !") == true

24 replies on 2 pages. Most recent reply: May 27, 2003 4:01 PM by Vlad Roubtsov

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

"hi there".equals("cheers !") == true (View in Weblogs)
Posted: May 21, 2003 3:08 AM
Summary
Java Strings are strange animals. They are all kept in one pen, especially the constant strings. This can lead to bizarre behaviour when we intentionally modify the innards of the constant strings through reflection. Join us, as we take apart one of Java's most prolific beasts.

Whenever we used to ask our dad a question that he could not possibly have known the answer to (such as: what's the point of school, dad?) he would ask back: "How long is a piece of string?"

Were he to ask me that now, I would explain to him that String is immutable (supposedly) and that it contains its length; all you have to do is ask the String how long it is, which you can do by calling length().

OK, so the first thing we learn about Java is that String is immutable. It is like when we first learn about the stork that brings the babies. There are some things you are not supposed to know until you are older! Secrets so dangerous that merely knowing them would endanger the fibres of electrons pulsating through your Java Virtual Machine.

So, are Strings immutable?

Playing with your sanity - Strings

Have a look at the following code:

public class MindWarp {
  public static void main(String[] args) {
    System.out.println(
      "Romeo, Romeo, wherefore art thou oh Romero?");
  }
  private static final String OH_ROMEO =
    "Romeo, Romeo, wherefore art thou oh Romero?";
  private static final Warper warper = new Warper();
}

If we are told that the class Warper does not produce any visible output when you construct it, what is the output of this program? The most correct answer is, "you don't know, depends on what Warper does". Now THERE's a nice question for the Sun Certified Java Programmer Examination.

In my case, running "java MindWarp" produces the following output:

C:> java MindWarp <ENTER>
Stop this romance nonsense, or I'll be sick

And here is the code for Warper:

import java.lang.reflect.*;

public class Warper {
  private static Field stringValue;
  static {
    // String keeps its characters in a private char[] named "value";
    // if no field has that name, fall back to the first char[] field
    try {
      stringValue = String.class.getDeclaredField("value");
    } catch(NoSuchFieldException ex) {
      // safety net in case we are running on a VM with a
      // different name for the char array.
      Field[] all = String.class.getDeclaredFields();
      for (int i=0; stringValue == null && i<all.length; i++) {
        if (all[i].getType().equals(char[].class)) {
          stringValue = all[i];
        }
      }
    }
    if (stringValue != null) {
      stringValue.setAccessible(true); // suppress access checks on the private field
    }
  }
  public Warper() {
    try {
      stringValue.set(
        "Romeo, Romeo, wherefore art thou oh Romero?",
        "Stop this romance nonsense, or I'll be sick".
          toCharArray());
      stringValue.set("hi there", "cheers !".toCharArray());
    } catch(IllegalAccessException ex) {} // shhh
  }
}

How is this possible? How can String manipulation in a completely different part of the program affect our class MindWarp?

To understand that, we have to look under the hood of Java. In the language specification it says in §3.10.5:

"Each string literal is a reference (§4.3) to an instance (§4.3.1, §12.5) of class String (§4.3.3). String objects have a constant value. String literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern."

The usefulness of this is quite obvious: we use less memory if two equivalent Strings point at the same object. We can also intern Strings manually by calling the intern() method.

The language spec goes a bit further:

  1. Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).
  2. Literal strings within different classes in the same package represent references to the same String object.
  3. Literal strings within different classes in different packages likewise represent references to the same String object.
  4. Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.
  5. Strings computed at run time are newly created and therefore distinct.
  6. The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.
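
The rules above can be checked with reference comparison (==), which asks whether two references point at the same object rather than whether the contents match. The following is a minimal sketch; the class name InternDemo and its method names are made up for illustration and do not come from the language specification.

```java
// Illustrative only: InternDemo and its methods are hypothetical names.
final class InternDemo {
    static final String LITERAL = "hi there";

    // Rules 1-3: identical literals refer to one shared String object.
    static boolean literalIsShared() {
        return LITERAL == "hi there";
    }

    // Rule 4: a constant expression is computed at compile time
    // and then treated like a literal.
    static boolean constantExpressionIsShared() {
        return LITERAL == "hi " + "there";
    }

    // Rule 5: a string computed at run time is a new, distinct object.
    static boolean runtimeStringIsShared() {
        String there = "there";           // not final, so not a constant
        String computed = "hi " + there;  // concatenated at run time
        return LITERAL == computed;
    }

    // Rule 6: interning a computed string yields the pre-existing literal.
    static boolean internedStringIsShared() {
        String there = "there";
        return LITERAL == ("hi " + there).intern();
    }

    public static void main(String[] args) {
        System.out.println(literalIsShared());            // true
        System.out.println(constantExpressionIsShared()); // true
        System.out.println(runtimeStringIsShared());      // false
        System.out.println(internedStringIsShared());     // true
    }
}
```

Because every occurrence of the literal refers to the same object, a change made to that one object is seen by every class that mentions the literal.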

This means that if a class in another package "fiddles" with an interned String, it can cause havoc in your program. Is this a good thing? (You don't need to answer ;-)

Consider this example:

public class StringEquals {
  public static void main(String[] args) {
    System.out.println("hi there".equals("cheers !"));
  }
  private static final String greeting = "hi there";
  private static final Warper warper = new Warper();
}

Running this against the Warper produces a result of true, which is really weird, and in my opinion, quite mind-bending. Hey, you can SEE the values there right in front of you and they are clearly NOT equal!

BTW, for simplicity, the Strings in my examples are exactly the same length, but you can change the length quite easily as well.

The last example concerns the hash code of String, which is now cached for performance reasons mentioned in "Java Idiom and Performance Guide", ISBN 0130142603. (Just for the record, I was never and am still not convinced that caching the String hash code in a wrapper object is a good idea, but caching it in String itself is almost acceptable, considering String literals.)

public class CachingHashcode {
  public static void main(String[] args) {
    java.util.Map map = new java.util.HashMap();
    map.put("hi there", "You found the value");
    new Warper();
    System.out.println(map.get("hi there"));
    System.out.println(map);
  }
  private static final String greeting = "hi there";
}

The output under JDK 1.3 is:

You found the value
{cheers !=You found the value}

Under JDK 1.2 it is:

null
{cheers !=You found the value}

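This is because String in JDK 1.3 caches its hash code: once calculated, it is never recalculated, so the hash code stays the same even if the value field changes. Under JDK 1.2 the hash code is recalculated from the new characters on every call, so the lookup misses the entry that was stored under the old hash code.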
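
The same effect can be reproduced without reflection. The sketch below is a stand-in built on assumptions, not the JDK source: the hypothetical class CachedHashText keeps its characters in an array, optionally caches its hash code, and has an overwrite() method that plays the role of Warper. Passing true is meant to mimic the JDK 1.3 behaviour described above, and false the JDK 1.2 behaviour.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for String; not the JDK implementation.
final class CachedHashText {
    private final char[] value;      // the characters, mutable in place
    private final boolean cacheHash; // true: cache the hash, false: always recompute
    private int hash;                // 0 means "not calculated yet"

    CachedHashText(String text, boolean cacheHash) {
        this.value = text.toCharArray();
        this.cacheHash = cacheHash;
    }

    // Plays the role of Warper: swaps the characters behind the map's back.
    // The replacement must have the same length as the original text.
    void overwrite(String replacement) {
        replacement.getChars(0, value.length, value, 0);
    }

    @Override
    public int hashCode() {
        if (cacheHash && hash != 0) {
            return hash;             // stale once the characters have changed
        }
        int h = 0;
        for (char c : value) {
            h = 31 * h + c;          // the formula documented for String.hashCode()
        }
        if (cacheHash) {
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedHashText
            && Arrays.equals(value, ((CachedHashText) other).value);
    }

    @Override
    public String toString() {
        return new String(value);
    }

    // Stores a key in a map, overwrites the key, then looks it up again.
    static String lookupAfterOverwrite(boolean cacheHash) {
        Map<CachedHashText, String> map = new HashMap<>();
        CachedHashText key = new CachedHashText("hi there", cacheHash);
        map.put(key, "You found the value");
        key.overwrite("cheers !");
        return map.get(key) + " " + map;
    }

    public static void main(String[] args) {
        // prints: You found the value {cheers !=You found the value}
        System.out.println(lookupAfterOverwrite(true));
        // prints: null {cheers !=You found the value}
        System.out.println(lookupAfterOverwrite(false));
    }
}
```

With caching, the lookup still uses the hash code that was calculated before the overwrite, so it finds the entry. Without caching, the lookup hashes the new characters and no longer matches the hash code stored with the entry.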

Imagine trying to debug this program where SOMEWHERE, one of your hackers has done a "workaround" by modifying a String literal. The thought scares me.

The practical application of this blog? Let's face it, none.

This is my first blog ever, and I would be keen to hear what you thought of it.


Donal Tobin

Posts: 1
Nickname: detobin
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 21, 2003 11:00 AM
I was reading through this article, and as I read I was thinking this is exactly the kind of thing that Java Specialists would love. Then I got to the end and saw it was you. Excellent article, by the way.

Vlad Roubtsov

Posts: 20
Nickname: vladr
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 21, 2003 12:37 PM
Heinz,
we should join forces :)

http://www.javaworld.com/javaworld/javaqa/2003-03/01-qa-0314-forname.html

For more esoterics see

http://vladium.blog-city.com
http://www.javaworld.com/columns/jw-qna-index.shtml

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 21, 2003 6:26 PM
> Heinz,
> we should join forces :)

Yeah, can you imagine ! Then we won't need Java obfuscators
anymore ;-)

I like your articles, excellent stuff. Really got me
thinking!

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 21, 2003 6:30 PM
> I was reading through this article, and as I read I was
> thinking this is exactly the kind of thing that Java
> Specialists would love. Then I got to the end and saw it
> was you. Excellent article, by the way.

Hi Donal,

I found your name as a long-time supporter of my newsletter, thanks so much for your note :-)

Heinz

Berco Beute

Posts: 72
Nickname: berco
Registered: Jan, 2002

Re: "hi there".equals("cheers !") == true Posted: May 22, 2003 4:45 PM
Yet another excellent article, like we are used to from your newsletter. Cheers for that, and welcome to the Artima Blogger community!

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 12:33 AM
> Yet another excellent article, like we are used to from
> your newsletter. Cheers for that, and welcome to the
> Artima Blogger community!

Thanks, Berco!

Murali

Posts: 2
Nickname: smurali
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 1:02 AM
Really an excellent article.

I have a question.

setObject() throws an exception if a final field is being modified. So, do you think this modification would have failed if Sun had declared the field value as final? I just glanced at the code of the String class and found that value is initialized in every constructor, so I think there should be no harm in making it final.

Also, theoretically, if the String object is immutable, the underlying data structure that holds the chars (value) should be immutable too. Is my understanding correct? Is there any reason why Sun has not declared it final?

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 1:19 AM
> Really an excellent article.
>
> I have a question.
>
> setObject() throws an exception if a final
> field is being modified. So, do you think if this
> modification would have failed if sun had declared this
> field value as final? I just glanced the code
> of String class and found that
> value is being initialized in every
> constructor. So i think there should be no harm in making
> it final.
>
> Also, theoretically, if the String object is immutable, the
> underlying data structure that holds the chars - value,
> should be immutable. Is my understanding correct? Is there
> any reason why Sun has not declared it final?

In my original article, I stated that this would not have been possible if the char[] had been final. I was wrong. Even if the char[] value were final, we could still modify the contents of the char[], since there is no way of making the contents of an array constant.
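
A minimal sketch of that point (the class name FinalArrayDemo is made up for illustration): final only stops the field from being reassigned, while the elements of the array it refers to can still be overwritten with ordinary code.

```java
// Illustrative only: FinalArrayDemo is a hypothetical name, not part of the JDK.
final class FinalArrayDemo {
    // final fixes the reference, not the contents of the array.
    private static final char[] VALUE = "hi there".toCharArray();

    static String overwriteContents() {
        // VALUE = "cheers !".toCharArray();  // would not compile: VALUE is final
        char[] replacement = "cheers !".toCharArray();
        System.arraycopy(replacement, 0, VALUE, 0, VALUE.length);
        return new String(VALUE);
    }

    public static void main(String[] args) {
        System.out.println(overwriteContents()); // prints: cheers !
    }
}
```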

Did you know that in Oak (the precursor of Java), fields could be made constant and methods final? When they moved over to Java, they changed that from const to final, but const is still a reserved keyword.

Murali

Posts: 2
Nickname: smurali
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 1:29 AM
Thanks a lot Heinz

Vlad Roubtsov

Posts: 20
Nickname: vladr
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 3:10 AM
>So, do you think if this
> modification would have failed if sun had declared this
> field value as final? I just glanced the code
> of String class and found that
> value is being initialized in every
> constructor. So i think there should be no harm in making
> it final.

A lesser-known fact is that pre-1.3 Java allowed modification of even final fields (via reflection).

Starting with 1.3 this is no longer possible in pure Java. JNI code can still freely modify final fields (as well as wreak other kinds of havoc, like calling virtual methods non-virtually, etc.).

Heinz Kabutz

Posts: 46
Nickname: drbean
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 8:35 AM
> > So, do you think if this
> > modification would have failed if sun had declared this
> > field value as final? I just glanced the code
> > of String class and found that
> > value is being initialized in every
> > constructor. So i think there should be no harm in making
> > it final.
>
> A less well known fact is that pre-1.3 Java allowed
> modification of even final fields (via reflection).
>
> Starting with 1.3+ this is no longer possible in pure
> Java. JNI code can still freely modify final fields (as
> well as wreak other kinds of havoc, like calling virtual
> methods non-virtually etc).

I did not know that! I had always thought it was possible, but when I tried it under JDK 1.3 for the original newsletter, I could not reassign the field, and I did not think of going back to 1.2. Thanks for that information :-)

It would not have been any different here, because you cannot make the content of a char[] constant.

Hristo Stoyanov

Posts: 5
Nickname: hristo
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 2:43 PM
Hi Heinz!
This is a good one, just like all the other topics in your newsletter! Sometimes I wish there were capital punishment for developers who abuse reflection and hack like that ;-)

However, I am concerned with the deficiencies of StringBuffer you exposed recently in one of the newsletters.
Keep twisting our brains!

Thanks,
Hristo

Vlad Roubtsov

Posts: 20
Nickname: vladr
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 3:52 PM
> However, I am concerned with the deficiencies of
> StringBuffer you exposed recently in one of the
> newsletters.

Strings and multi-dimensional arrays have more deficiencies besides access speed: memory overhead. As an example, int[128][2] takes 246% more JVM memory than int[256] (another argument for preferring single-dim arrays when you could use those).
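
One way to arrive at a figure like 246% is plain arithmetic over assumed object sizes. The sketch below assumes a 16-byte array header, 4-byte references, 4-byte ints and 8-byte alignment; these numbers are assumptions chosen for illustration, not measurements, and real JVMs differ. The class name ArrayFootprint and its methods are hypothetical.

```java
// Back-of-the-envelope estimate under assumed sizes; not a measurement.
final class ArrayFootprint {
    static final int ARRAY_HEADER_BYTES = 16; // assumption
    static final int REFERENCE_BYTES = 4;     // assumption
    static final int INT_BYTES = 4;

    // Rounds a size up to the next multiple of 8 (assumed object alignment).
    static long align8(long bytes) {
        return (bytes + 7) & ~7L;
    }

    // Estimated size of an int[length].
    static long intArray(int length) {
        return align8(ARRAY_HEADER_BYTES + (long) INT_BYTES * length);
    }

    // Estimated size of an int[rows][cols]: one array of references
    // plus one separate int[cols] object per row.
    static long nestedIntArray(int rows, int cols) {
        long outer = align8(ARRAY_HEADER_BYTES + (long) REFERENCE_BYTES * rows);
        return outer + rows * intArray(cols);
    }

    // How much larger a is than b, in percent, rounded to the nearest integer.
    static long percentMore(long a, long b) {
        return Math.round(100.0 * (a - b) / b);
    }

    public static void main(String[] args) {
        long flat = intArray(256);             // 16 + 256 * 4 = 1040
        long nested = nestedIntArray(128, 2);  // 528 + 128 * 24 = 3600
        System.out.println(percentMore(nested, flat)); // prints: 246
    }
}
```

Under these assumptions most of the extra cost comes from the 128 small row arrays, each of which pays a full header to hold two ints.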

Roughly 50% of all Strings in a typical app could be 10 chars or shorter, and they impose a ridiculous amount of overhead compared to the 20 bytes you need to store 10 Unicode chars.

http://www.javaworld.com/javaworld/javatips/jw-javatip130.html

Lyndon

Posts: 2
Nickname: include123
Registered: May, 2003

Re: "hi there".equals("cheers !") == true Posted: May 23, 2003 7:05 PM
Self-modifying code has been around a long, long time...


Copyright © 1996-2014 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use - Advertise with Us