Non Blocking with Traditional Java IO - On the Use of InputStream.available() and Thread.sleep()

Some time ago I did quite a lot of I/O in Java, yet I had never seen this way of reading an InputStream from a socket:
    InputStream in=channel.getInputStream();

    channel.connect();

    byte[] tmp=new byte[1024];
    while(true){
      while(in.available()>0){
        int i=in.read(tmp, 0, 1024);
        if(i<0)break;
        System.out.print(new String(tmp, 0, i));
      }
      if(channel.isClosed()){
        System.out.println("exit-status: "+channel.getExitStatus());
        break;
      }
      try{Thread.sleep(1000);}catch(Exception ee){}
    }
    channel.disconnect();


This comes from an example shipped with JSch, a good SSH client in Java. A work colleague had the bad idea of removing the Thread.sleep call and was struggling to understand why it only worked randomly. The way I would have done it is the following:
    InputStream in=channel.getInputStream();

    channel.connect();

    byte[] tmp=new byte[1024];
    int bytesRead;
    // read() blocks until data arrives and returns -1 at end of stream
    while((bytesRead = in.read(tmp, 0, 1024)) >= 0){
      System.out.print(new String(tmp, 0, bytesRead));
    }

    if(channel.isClosed()){
      System.out.println("exit-status: "+channel.getExitStatus());
    }
    channel.disconnect();


This has the advantage of being more readable and having fewer secret spices in it. In the first code, the call to available() is non-blocking, so without the Thread.sleep() the loop spins without pausing instead of leaving the socket buffer time to fill up. But which of the two is more efficient?

I did a search on Google to understand the point of the first code. The only advantages I found are the possibility of interrupting the thread running the code and finer-grained control over timeouts.
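To make those two advantages concrete, here is a minimal sketch of the polling style with an explicit deadline. The class PollingReader, the method readWithDeadline and its parameters are names made up for this illustration; they are not part of JSch or of the JDK. The sketch assumes the stream reports pending bytes through available(). Note that available() typically returns 0 at end of stream, so this style cannot detect the end on its own, which is presumably why the JSch example also checks channel.isClosed().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class PollingReader {

    /**
     * Polls the stream until some bytes are available or the deadline passes.
     * Returns the text that was read, or null if nothing arrived in time.
     * Thread.sleep() makes the wait interruptible: interrupt() on the
     * calling thread surfaces here as an InterruptedException.
     */
    static String readWithDeadline(InputStream in, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        byte[] buffer = new byte[1024];
        while (System.nanoTime() - deadline < 0) {
            if (in.available() > 0) {
                // Data is already buffered, so this read does not block.
                int n = in.read(buffer, 0, buffer.length);
                return n < 0 ? "" : new String(buffer, 0, n, StandardCharsets.UTF_8);
            }
            // Without this pause the loop would burn a full CPU core.
            Thread.sleep(pollMillis);
        }
        return null; // timed out
    }
}
```

The price of this control is latency: data that arrives just after a check waits up to pollMillis before it is read.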

There is a lengthy explanation by Doug Lea in his book "Concurrent Programming in Java". This book usually provides excellent explanations, and is a must read for anybody doing concurrent programming. But this time, about this subject, I did not find him that clear.

There is a simpler explanation in a course from San Diego State University (see the last example):
A read() on an inputstream or reader blocks. Once a thread calls read() it will not respond to interrupt() (or much else) until the read is completed. This is a problem when a read could take a long time: reading from a socket or the keyboard. If the input is not forth coming, the read() could block forever.

As usual, you should not rely on everything you read on the web, as this page (SCJP Questions & Answers) shows:
Q. When will a Thread I/O blocked?
A:
When a thread executes a read() call on an InputStream, if no byte is available. The calling Thread blocks, in other words, stops executing until a byte is available or the Thread is interrupted.

Still, I wonder whether the second code would not simply throw an IOException (a SocketTimeoutException) once the timeout, adjustable with Socket.setSoTimeout, expires, and release the thread then. Do you have an idea of when the first code could be better?
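A small experiment along these lines, as a sketch only: a plain java.net.Socket is connected to a local server that never writes anything, and a read timeout is set with setSoTimeout. The class ReadTimeoutDemo and the method readFromSilentServer are hypothetical names for this illustration. It demonstrates the behaviour of a plain socket stream only, and says nothing about how a JSch channel stream reacts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadTimeoutDemo {

    /** Connects to a local server that stays silent and reports how the blocking read ends. */
    static String readFromSilentServer(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // A blocking read on this socket now gives up after timeoutMillis.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                int b = in.read(); // blocks: the server side never writes
                return "read returned " + b;
            } catch (SocketTimeoutException e) {
                // SocketTimeoutException is an IOException; the socket stays usable.
                return "timed out";
            }
        }
    }
}
```

Calling ReadTimeoutDemo.readFromSilentServer(200) returns "timed out" after roughly 200 ms, so the thread is indeed released, although only at the granularity of the configured timeout.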

Top 10 Most Read Last Week On Javablogs.com, Week 22


Most read last week

  1. Why I'm leaving Sun and... what next? (304): Leaving Sun is possibly one of the most difficult decisions I've ever made. But I think it's time for me to start new things and, well, it's also time for Sun to start new things. [read]

  2. 9 new and *noteworthy* features in Eclipse 3.2 (280): Chris Laffra has put a big presentation of all the new and noteworthy features that are coming in the Eclipse 3.2 platform.There is lots of screenshots, [read]

  3. Don’t you just love it when Microsoft is forced to use Java? (253): Don’t you just love it when Microsoft is forced to use Java? [read]

  4. Joshua Bloch Shocking Confession: java.util.Arrays Is Broken (218): (This should be all over the net by now. I first saw it here. [read]

  5. Signs You're a Crappy Programmer (and don't know it) (212): This ought to put a smile on your face unless it hits too close to home! [read]

  6. Spring starts you programming in pure XML! (211): I'm now very enthusiastic, some may say obsessed, about using the Spring Framework. Spring is certainly making me more productive. [read]

  7. Google's GWT Example: Interactive FIFA 2006 World Cup Application (203): Goto the application without reading the entry.A couple of weeks ago google released the Google Web Toolkit. The toolkit lets Java developers create AJAX application, [read]

  8. Testicular Cancer (198): I've just been diagnosed with testicular cancer. It's probably seminoma (which is relatively good). However, other things are now on my mind and so this blog will not be updated for a while. [read]

  9. Intel iMac - the best Java development box in the world (182): Cancelling my MacBook Pro order worked out really well for me. Firstly, I'm getting a new company laptop (a big grey Dull), and the prospect of carrying two laptops around just scares me. Second, [read]


Most read last week-end

  1. Joshua Bloch Shocking Confession: java.util.Arrays Is Broken (218): (This should be all over the net by now. I first saw it here. [read]

  2. Hibernate3, Annotations & Spring: the morning after (116): Ok, after fiddling with a few more jars, I seem to have gotten past my main obsticle. Now I'm back to the "common problems" phase of working with Hibernate. Still evaluating, [read]

  3. JBoss Seam: Make Spring inside (111): We here try out JBoss Seam, and find JBoss Microkernel, is this a IOC from JBoss? Why dont we use Spring instead? [read]

  4. Why are we still dealing with C++ vs Java (97): I had an interesting conversation this week with someone that believed that the Java world was filled with people who couldn’t code. In his opinion, [read]

  5. "Look Ma, no locks!" (97): Brian Goetz has written an excellent introductory article on nonblocking algorithms and showcases some simple nonblocking data structures with code examples and pictures. [read]

  6. 6 Ways of Setting Java Classpath (96): How to set Java classpath? List as many ways as you can. This can be an interesting Java job interview question. [read]

  7. Spring, JBoss and The Elephant (96): Interface 21 and BEA recently announced the release of Pitchfork, the EJB3 implementation within the Weblogic container, built using Spring. [read]

  8. Ubuntu 6.06 (94): I upgraded my Ubuntu box the other night. The process was pretty impressive. For one thing, it was significantly faster than an MSWindows or MacOS upgrade. [read]

  9. Integrating spring and GWT (84): I just completed my first shot at integrating Spring with GWT. You can check it out here Here’s what you do to expose a simple service. 1) Write the remote service, [read]


Is Java Flawed - a big advantage of Python/Ruby/(your favorite interpreted language)

Java is supposed to be much better for building big projects, because of static type checking and all the rigour around the language. But how many of you have seen medium-sized projects taking more than 30 minutes to build?

At work, they have a standard J2EE project, with only about 50 EJBs, hundreds of JDO classes, and standard classes. Between the JDO generation, EJB generation, EJB dependency calculations, and packaging, it takes 20 minutes. And the project is not doing that much. One can optimize to avoid the dependency calculations and it would then take about 12 minutes. But still.

With Python, Ruby or your favorite interpreted language, it would take at most a few seconds to test a new version. Now, I am a fan of Java and don’t much enjoy programming in interpreted languages, especially since modern IDEs like Eclipse do so much for you.

To sum up, Java is supposed to be the right language for large scale projects, and yet it is on large scale projects that compilation is an issue. Ok, this will enforce a better separation of concerns, and will probably be beneficial in the end. But I am not sure that most people achieve a good separation of concerns in the projects they manage. I think it is no accident that the next step in Java development involves much more runtime compilation. It somehow started with JSPs, and then continued with the various XML config files combined with Java reflection. It is now made even more popular by annotations, see EJB 3.0.

I am wondering a bit if one day compiled languages will be only a curiosity for most enterprise projects.

Using Linux to Recover Fucked Up Windows Data

Recently, the Windows XP computer of one of my relatives refused to boot. There was no way of fixing it with the Windows install CD, as the partition table seemed corrupt to Windows. I tried everything on a 2003 Ultimate Boot CD, but nothing worked.

Someone gave me an install cd of Ubuntu Linux, and it managed to read the data. Well sometimes only. The erratic behaviour was due to a bad ATA cable. This probably was the cause of the corruption in the first place. Anyway with a new cable, Windows was still not able to read its data. But Ubuntu Linux, now working well, was able to, without having anything to configure (except mounting the drive).

So I copied the data, reformatted the NTFS partition, reinstalled Windows, and copied the data back.

After this experience, I’d recommend that any Windows user keep a spare Ubuntu Linux Live CD, just in case Windows corrupts itself.

Java HTML Parsing Example With htmlparser

Every week, I post the javablogs top 10 most read blog entries on this blog. The reason is that I don't look at what's happening during the week-end, so this picks up interesting stories from the weekend, and I also don't watch javablogs every day. Overall I find it a good way to stay up to date with interesting stuff happening on javablogs.

As mentioned in an earlier post, my library of choice to do the parsing is htmlparser (on sourceforge) because it's free, open source and because I am lazy and did not want to write my own. If you know a better open source library, feel free to add a comment about it, I'll be glad to hear about it. htmlparser is not the easiest library to use: there are many entry points and it's not immediately clear which one to choose. So I post here how I used it, in case it can save a few minutes for people having to do this task.

  private static Entry parseEntry(String content) throws ParserException
  {
    final Entry entry = new Entry();

    final NodeVisitor linkVisitor = new NodeVisitor() {
      
      @Override
      public void visitTag(Tag tag) {
        String name = tag.getTagName();

        if ("a".equalsIgnoreCase(name))
            {
              String hrefValue = tag.getAttribute("href");
              if (hrefValue != null && !hrefValue.startsWith("http://"))
              {
                if (!hrefValue.startsWith("/")) hrefValue = "/"+hrefValue;
                hrefValue = "http://javablogs.com"+hrefValue;
                //System.out.println("test, value="+hrefValue);
              }
              if (hrefValue != null)
              {
                hrefValue = hrefValue.replaceAll("&", "&amp;");
                tag.setAttribute("href", hrefValue);                
              }
            }
      }
    
    };
    
    NodeVisitor visitor = new NodeVisitor() {

      @Override
      public void visitTag(Tag tag) {        
        String name = tag.getTagName();
            if ("span".equalsIgnoreCase(name) || "div".equalsIgnoreCase(name))
            {              
              String classValue = tag.getAttribute("class");
//                LOGGER.debug("visittag name="+name+" class="+classValue+"children="+tag.getChildren().toHtml());
              if ("blogentrydetails".equals(classValue))
              {
                // require at least one digit so that parseInt never sees an empty string
                Pattern countPattern = Pattern.compile("Reads:\\s*([0-9]+)");
                Matcher matcher = countPattern.matcher(tag.getChildren().toHtml());
                if (matcher.find())
                {
                  String countStr = matcher.group(1);
                  entry.count = Integer.parseInt(countStr);
                }
                
              }
              else if ("blogentrysummary".equals(classValue))
              {
                try
                {
                  tag.getChildren().visitAllNodesWith(linkVisitor);
                }
                catch (ParserException pe)
                {
                  LOGGER.error(pe,pe);
                }
                entry.description = tag.getChildren().toHtml();                 
                entry.description = entry.description.replaceAll("\\s+", " ");
              }
              else if ("blogentrytitle".equals(classValue))
              {
                try
                {
                  tag.getChildren().visitAllNodesWith(linkVisitor);
                }
                catch (ParserException pe)
                {
                  LOGGER.error(pe,pe);
                }
                entry.title = tag.getChildren().toHtml();
                entry.title = entry.title.replaceAll("\\s+", " ");
              }              
            }
            
      }

    };
    Parser parser = new Parser(new Lexer(new Page(content,"UTF-8")));
    parser.visitAllNodesWith(visitor);
        if (entry.title != null)
        {
          return entry;
        }
        else return null;
  }
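The two pieces of plain string logic inside the visitors, the read-count regex and the link normalization, can be tried without htmlparser. The sketch below is only an illustration: the class EntryText and its method names are made up, and the sample fragment "Reads: 304" is an assumption about what the javablogs markup looked like.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntryText {
    private static final Pattern COUNT_PATTERN = Pattern.compile("Reads:\\s*([0-9]+)");

    /** Returns the read count found in the HTML fragment, or -1 when there is none. */
    static int extractReadCount(String html) {
        Matcher matcher = COUNT_PATTERN.matcher(html);
        return matcher.find() ? Integer.parseInt(matcher.group(1)) : -1;
    }

    /** Turns a site-relative href into an absolute javablogs.com URL, as linkVisitor does. */
    static String toAbsolute(String href) {
        if (href.startsWith("http://")) {
            return href;
        }
        return "http://javablogs.com" + (href.startsWith("/") ? href : "/" + href);
    }

    /** Collapses runs of whitespace into single spaces, as done for title and description. */
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }
}
```

For instance, EntryText.extractReadCount("<span>Reads: 304</span>") gives 304, and EntryText.toAbsolute("Jump.action?id=1") gives "http://javablogs.com/Jump.action?id=1" (the path here is invented for the example).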

Top 10 Most Read Last Week On Javablogs.com, Week 21


Most read last week

  1. Spring vs JBoss, and why I don’t care about Sun standards (272): After a long time, it was interesting to see the Spring and JBoss folks engage in a public war of words, in comments on Matt Raible’s blog. [read]

  2. Kent Beck: "We thought we were just programming on an airplane" (231): JUnit co-creator Kent Beck says a number of things convinced he and Erich Gamma to create a new revision of JUnit after a long hiatus, including TestNG and Java 5. Last week at JavaOne, [read]

  3. Where are you, Project Manager with Technical Skills? (204): In Spain we are facing again a lack of workers with experience in development of not-so-cutting-edge technologies like J2EE. So, [read]

  4. Thanks... and good luck Bruce! (203): It is unfortunate that Bruce Tate forgot to enable comments to his final blog entry. It would be a shame to see him off without at least a small well-wishing. (possibly a little roast too ;-) [read]

  5. Google Web Toolkit Angst (202): I've been using Google Web Toolkit for the last week or so. I'm really liking it, it is really productive and once you getting it working everything is sweet. The problem is, [read]

  6. Is this simpler than Hibernate? (193): In an earlier blog entry I described an early cut of DynaModel, Slingshot's persistence engine. [read]

  7. Article: Don't repeat the DAO! : Build a generic typesafe DAO with Hibernate and Spring AOP (192): Don't repeat the DAO! : Build a generic typesafe DAO with Hibernate and Spring AOP is a developerWorks article by Per Mellqvist which presents a generic DAO implementation class based on Hibernate, [read]

  8. Why ORM Tools are Not Recommended (185): Sandeep Sha has written an a forum posting by Why ORM Tools are Not Recommended that has some interesting points. Although I do not agree with all the points, [read]

  9. The Dojo Toolkit in Practice (185): We have posted a new article on using the Dojo Toolkit in a project. The article discusses a piece of a project that uses Ajax to create a responsive itinerary viewer. [read]


Most read last week-end

  1. Spring vs JBoss, and why I don’t care about Sun standards (272): After a long time, it was interesting to see the Spring and JBoss folks engage in a public war of words, in comments on Matt Raible’s blog. [read]

  2. Thanks... and good luck Bruce! (203): It is unfortunate that Bruce Tate forgot to enable comments to his final blog entry. It would be a shame to see him off without at least a small well-wishing. (possibly a little roast too ;-) [read]

  3. Is this simpler than Hibernate? (193): In an earlier blog entry I described an early cut of DynaModel, Slingshot's persistence engine. [read]

  4. What’s Up With Huge Resumes? (150): What’s up with huge resumes these days? The company I work for has been hiring lately and so I usually end up interviewing one to two people a week. [read]

  5. Introducing jvm-languages.com (147): Back in September of 2004, I tried to write a book. It would have been called Dynamic Languages and Java. Unfortunately, I never completed it. [read]

  6. Comparison Between PMD vs Findbugs vs Hammurapi (135): Take a look at this one the differences between these three tools Differences [read]

  7. Then God said let there be Ubuntu... ahem (130): Finally I got a version of Linux, which works as good as XP or even better ;) ; using which I can get to do my work seamlessly. Its none other than Ubuntu Dapper. [read]

  8. Job Trend, Not Google Trend (121): Wanna know the amount of Java jobs versus .Net jobs, or the growth of AJAX jobs? Google Trend may be able to help you a bit, but the result is not scoped for jobs only. Indeed. [read]

  9. 1-Minute Quiz: Why is Hyphen Illegal in Identifier? (110): Why is hyphen (-) an illegal char in Java identifier? Why can't we use variable names like first-name, as we do in xml files? The answer to this question is not hard, but the challenge is, [read]


Top 10 Most Read Last Week On Javablogs.com, Week 20


Most read last week

  1. The Worst Java Job Interview Questions. (269): Why are you looking for a job? Strictly speaking, this is not a java question, but it shows up in almost every job interview I've been to. [read]

  2. Goodbye Ant , Welcome Maven 2 (219): After years of using Ant for building my applications, I have moved to something different, Apache Maven 2. And now it seems there is no looking back. [read]

  3. Google Web Toolkit: A Brief Review (219): Google has released GWT - a java window toolkit which converts your java applications (using the toolkit API) to javascript (incl. AJAX) and HTML. [read]

  4. A *bold* paper against Threads (214): Edward A. Lee wrote a paper called "The Problem with Threads", you can find his pdf paper here. There is no rant here but facts, and sound reasoning. [read]

  5. Outsourcing your code is so cheap ... but why are so many jobs coming back from their indian trip ? (202): There are websites where you can get very cheap developpers, here are the one I know: http://www.getacoder.com/ http://www.rentacoder.com/ http://www.getafreelancer.com/ http://www. [read]

  6. Signs You're a Crappy Programmer (and don't know it) (190): Please read this great post from Damien Katz, and watch the signs Java is all you'll ever need. "Enterprisey" isn't a punchline to you. [read]

  7. Google Web Toolkit: Web Applications Just Got Harder (182): Oh the buzz. Oh the excitement. Oh the AJaX Gods has released their secret sauce with an Apache license. Google Web Toolkit allows one to develop AJaX web applications entirely in Java, [read]

  8. PDFs available for JavaOne 2006 Sessions (177): Check out the JavaOne 2006 Conference Session Catalog: “Presentation files available for download are indicated with a paperclip icon. After clicking on a paperclip, [read]

  9. Google Web Toolkit for building AJAX apps in Java (173): Google has introduced a toolkit for building AJAX applications in Java, though its in beta. It has also supplied some sample applications with the kit. [read]


Most read last week-end

  1. PDFs available for JavaOne 2006 Sessions (177): Check out the JavaOne 2006 Conference Session Catalog: “Presentation files available for download are indicated with a paperclip icon. After clicking on a paperclip, [read]

  2. Cringely: Why IBM Is in Trouble (159): Robert X. Cringley doesnt have a high opinion of IBM. Last week, he wrote, ...what is IBM? IBM is a disaster-in-the-making. [read]

  3. JavaOne Gossip: NetBeans Pulls a Prank on Eclipse (147): Humor makes life fun. Life just got a lot funnier. For some I guess. netBeans - Eclipse 1-0. Post your suggestions on how Eclipse should get even. [read]

  4. Day 5: McNealy, Gosling, Gage: "Forget the box" (139): With a mixture of sadness, relief, and hope for the future, former Sun CEO Scott McNealy took the stage this morning at the final keynote address of JavaOne 2006. [read]

  5. Project Harmony gets AWT/Swing Contrib from Intel (127): This may be a bit late but at JavaONE this year JEdit was shown running on the AWT/Swing contribution that Intel gave to Project Harmony. [read]

  6. Java 7.0 (Dolphin): Evolving in the Ecosystem (121): Sun developer Danny Coward says "Compatibility is king", but Sun is not staying still in the Java space. [read]

  7. This is genuine Microsoft (120): I started playing with Google Web Toolkit beta- actually I didn’t really start. Because I had to uninstall IE7 (which I don’t use at all), but hey I’d been curious. [read]

  8. Become a Java Champion, stay in useless Country, Learning Java for what? (120): I just thinking, what should we learn Java? Why dont use dotNet, I read Matt blog about his income US$ 200k more, or Mike Conan in OZ, that become the good best company. Today I just dont know, [read]

  9. jBixbe: a java tool I consider ... buying ! (90): I found this tool on Erik's linkblog, thanks to him ! [read]


Top 10 Most Read Last Week On Javablogs.com, Week 19


Most read last week

  1. Axis2: Why bother? (257): The Axis team is kicking up a big fuss about their recent release of Axis 2 (1.0!) Surprisingly, this library is so so abysmally bad, [read]

  2. Google trends proves: Java is doomed (251): Google trends is a nice idea, and I had to apply it adhoc to Java, Ruby, Python and C#. Interesting results, I can see a decline in Java! [read]

  3. Rich Open Source Webmail that doesn't suck (219): Guys...lets face it. Squirrel Mail... So check out our killer rich webmail. [read]

  4. Your Next Programming Language (216): Many people talk about how, as software developers, we should learn new programming languages frequently. [read]

  5. How to recognize a "Sacred Code" (210): You know you are dealing with a "sacred code" when you ask a previous developer (or the designer of the code) a question about the code and his immediate reply is .... [read]

  6. All you ever wanted to know about Workflow and how it relates to Java, Transactions and Concurrency (204): Read this blog carefully and you're in for a PAYRAISE. Workflow and business process technology will be essential in developing next generation applications. The knowledge about it is scarce. [read]

  7. Omg - I love this (Mac users may not) (203): This guy doesn't like Macs Damnation this is funny.... [read]

  8. 7 Reasons Why Web Apps Fail (179): Web applications are popping up faster and faster every day, and quite a few are using the power that Ajax offers to their advantage. [read]

  9. Scaling out 37 Signal-style applications is convenient (179): I had someone telling me that: Ruby can scale. Basecamp prooves that. Now, you all know that I do not think that Ruby has ANY problems with scaling. However, [read]


Most read last week-end

  1. Omg - I love this (Mac users may not) (203): This guy doesn't like Macs Damnation this is funny.... [read]

  2. How to Design a Good API (176): I was reading this presentation on the Design of API's by Joshua Bloch it talk's about how to design a good api but more importantly the reasons why doing certain things results in a good design. [read]

  3. JRuby on Rails Is Born (172): JavaOne attendees are in for a treat. Not only will they be receiving a DDJ issue which calls Rails a tipping-point to a new era in enterprise computing (or something like that)... [read]

  4. JavaOne day -1 : Bird Strike (147): The plan was to fly out from Sydney to San Francisco today. The plane was fueled, the travelers boarded. The aircraft taxied out to the runway, takeoff speed was reached, [read]

  5. 10 things i love about my Mac (125): A switcher Top 10 of nice things on Mac OS X: 1. The way programs live in the system (no registry shit) 2. The shell 3. Firewire boot capabilities 4. Apps like iChat, iSync and Addressbook 5. [read]

  6. YouTube bandwith usage/costs ... AMAZING ! (121): While looking for successfull video hosting I found this techcrunch article about youtube called Did YouTube Just Raise another $25 million? [read]

  7. 10 things i hate about my Mac (113): A switcher Top 10 of ugly issues with Apple Mac OS X: 1. No @ key in boot camp windows installation available 2. All banking programs on mac really suck 3. adv. [read]

  8. Commons Collections 3.2 Released (113): Commons Collections 3.2 has been released. Commons Collections is a library that builds upon the Java Collection Framework. It provides additional Map, [read]

  9. GoogleTrends : Java vs C# vs PHP (112): La comparaison est un poison. Ceci dit, comparer "l'intérêt" pour java, C# et PHP avec GoogleTrends, le nouveau service de Google, était très tentant... [read]


First Steps With EhCache

If you need to cache objects in your system, Ehcache is a simple cache written in Java, widely used and well tested. I will present here a short tutorial on how to use EhCache for people who don't want to look around the documentation at first, but just want to test if it works in their project and to see how easy it is to setup.
Installation
Download Ehcache from the Download link on http://ehcache.sourceforge.net. The current release is 1.2.
Unpack Ehcache with an unpacker that knows the tgz format. For Unix users, it is trivial; for Windows users, 7zip is a free (and open-source) unpacker. It is probably the most popular, but there are others like tugzip, izarc or winrar.
In your Java project you need to have ehcache-1.2.jar, commons-collections-2.1.1.jar and commons-logging-1.0.4.jar (version numbers may vary) in your classpath. Those libraries are shipped with Ehcache.
Cache Configuration
Write an ehcache.xml file where you describe the caches you want to use. There can be several files per project and several cache descriptions per file. I use a persistent cache here. The configuration file is well described at http://ehcache.sourceforge.net/documentation/configuration.html
<ehcache>
<cache name="firstcache" maxElementsInMemory="10000" eternal="false" overflowToDisk="true" timeToIdleSeconds="0" timeToLiveSeconds="0" diskPersistent="true" diskExpiryThreadIntervalSeconds="120"/>

</ehcache>

Code
// Cache, CacheManager and Element come from the net.sf.ehcache package.
// These fields belong to the enclosing class (TestClass here); LOGGER is assumed to be declared there too.
private static CacheManager cacheManager;
private static Cache cache;

static {
    // Create a CacheManager using a specific config file
    cacheManager = CacheManager.create(TestClass.class.getResource("/config/ehcache.xml"));
    cache = cacheManager.getCache("firstcache");
}

/**
 * Retrieves the value from the cache if it exists;
 * if not, creates it and adds it to the cache.
 */
public String doit(String key, String value) {
    // get an element from the cache by key
    Element e = cache.get(key);
    if (e != null) {
        value = (String) e.getValue();
        LOGGER.info("retrieved " + value + " from cache ");
    } else {
        value = "new value";
        cache.put(new Element(key, value));
    }
    return value;
}

/** Refreshes the value for the given key by removing it from the cache. */
public void refresh(String key) { cache.remove(key); }

/** To be called when your application is exiting. */
public void shutdown() { cacheManager.shutdown(); }
Conclusion
Using EhCache is as simple as using a Java Map with an additional configuration file.
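For comparison, here is the same get-or-create, refresh pattern written against a plain Map. This is only an analogy to show how close the two usages are: the class MapBackedCache is a made-up name, it uses no Ehcache API at all, and it offers none of the expiry, overflow-to-disk or persistence configured in ehcache.xml.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MapBackedCache {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    /** Returns the cached value for the key, creating and storing it on a miss. */
    String doit(String key) {
        return store.computeIfAbsent(key, k -> "new value");
    }

    /** Drops the entry so that the next doit() call recreates it. */
    void refresh(String key) {
        store.remove(key);
    }

    /** Number of entries currently held. */
    int size() {
        return store.size();
    }
}
```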

Null vs. Errors

I am not particularly a fan of JCS (the Jakarta Java Caching System), as I find the ehcache code very clean and simple. But I have to say the author has some good comments on the site:
 
Nulls vs. Errors
 
I started to support ObjectNotFoundExceptions for failed gets but the overhead and cumbersome coding needed to surround a simple get method is ridiculous. Instead the JCS return null.
 
Having seen the ObjectNotFoundException "pattern" too many times, I can only agree!
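To show what the quote is complaining about, here is a small sketch comparing the two styles on the caller's side. Everything in it is hypothetical: LookupStyles, getOrNull, getOrThrow and this ObjectNotFoundException are made-up names for the illustration, not the JCS API.

```java
import java.util.HashMap;
import java.util.Map;

final class LookupStyles {

    /** Hypothetical checked exception, standing in for the "pattern" discussed above. */
    static final class ObjectNotFoundException extends Exception {
        ObjectNotFoundException(String key) {
            super("no object for key " + key);
        }
    }

    private final Map<String, String> store = new HashMap<>();

    void put(String key, String value) {
        store.put(key, value);
    }

    /** Null style: a miss is just a null return value. */
    String getOrNull(String key) {
        return store.get(key);
    }

    /** Exception style: a miss forces every caller into a try/catch. */
    String getOrThrow(String key) throws ObjectNotFoundException {
        String value = store.get(key);
        if (value == null) {
            throw new ObjectNotFoundException(key);
        }
        return value;
    }

    /** Caller using the null style: one line. */
    String describeWithNull(String key) {
        String value = getOrNull(key);
        return value != null ? value : "miss";
    }

    /** Caller using the exception style: same result, more ceremony. */
    String describeWithException(String key) {
        try {
            return getOrThrow(key);
        } catch (ObjectNotFoundException e) {
            return "miss";
        }
    }
}
```

Both describe methods return the same thing for hits and misses; only the amount of code around the simple get differs.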
 
