I wrote my previous post too fast. I found a very simple change that increases the speed x6!
The idea is too process messages in a ThreadPoolExecutor. As my Nodes are Runnable, I just needed to initialize a common ThreadPoolExecutor, and in a sendMessage, execute the runnable each time.
Here is the full code:
publicclass OptimizedRing {
private ExecutorService executor;
publicstaticvoid main(String[] args) throws Exception { OptimizedRing ring = new OptimizedRing(); RingNode node = ring.startRing(Integer.parseInt(args[0])); node.sendMessage(new StartMessage()); }
private Timer startTimer() { Timer timer = new Timer(); new Thread(timer).start(); return timer; }
private RingNode[] spawnNodes(int n, final Timer timer) { System.out.println("constructing nodes"); long start = System.currentTimeMillis(); executor = Executors.newFixedThreadPool(4); RingNode[] nodes = new RingNode[n+1]; for (int i = 0; i < n ; i++) { nodes[i] = new RingNode(i, timer, null); } long end = System.currentTimeMillis(); System.out.println("Took "+(end-start)+"ms to construct "+n+" nodes"); return nodes; }
privatevoid connectNodes(int n, RingNode[] nodes) { System.out.println("connecting nodes"); nodes[n] = nodes[0]; for (int i=0; i<n; i++) { nodes[i].connect(nodes[i+1]); } }
publicvoid sendMessage(Message m) { //we don't need to change this implementation as timer is rarely called queue.add(m); }
publicvoid run() { while (true) { Message m; try { m = queue.take(); if (m instanceof StartMessage) { startTime = System.currentTimeMillis(); timing = true; } elseif (m instanceof StopMessage) { long end = System.currentTimeMillis(); System.out.println("Start="+startTime+" Stop="+end+" Elapsed="+(end-startTime)); timing = false; } elseif (m instanceof CancelMessage) { break; } } catch (InterruptedException e) { e.printStackTrace(); } } } } }
Code
Spawn
Send 100M messages
Scala Actors
15ms
270104ms
SimpleRing
11ms
493073ms
OptimizedRing (4 threads)
6ms
84727ms
OptimizedRing (5+ threads)
5ms
62593ms
OptimizedRing (1 thread)
5ms
60660ms
I finally saw my 4 cores used! Max multithreaded throughput is achieved at 5 threads. However 1 thread is faster. Is this related to memory bandwith limit?
Now I am left wondering if actors are really that important if one can achieve much higher throughput using plain Java and very simple concepts (BlockingQueue, ThreadPoolExecutor). Worse, this test is actually faster with only 1 thread...
Alex Miller has a very interesting test of Actors. He finds out Scala performance is relatively low compared to Erlang, and Kilim is very near Erlang. But Kilim code is the most difficult to read in the lot.
I thought it would be simple to just do the same test in plain Java. I wrote the code for it duplicating the scala logic using Threads instead of Actors.
publicclassSimpleRing{publicstaticvoidmain(String[]args)throwsException{SimpleRingring=newSimpleRing();RingNodenode=ring.startRing(Integer.parseInt(args[0]));node.sendMessage(newStartMessage());}publicRingNodestartRing(intn){RingNode[]nodes=spawnNodes(n,startTimer());connectNodes(n,nodes);returnnodes[0];}privateTimerstartTimer(){Timertimer=newTimer();newThread(timer).start();returntimer;}privateRingNode[]spawnNodes(intn,finalTimertimer){System.out.println("constructing nodes");longstart=System.currentTimeMillis();RingNode[]nodes=newRingNode[n+1];for(inti=0;i<n;i++){nodes[i]=newRingNode(i,timer,null);newThread(nodes[i]).start();//later use pool}longend=System.currentTimeMillis();System.out.println("Took "+(end-start)+"ms to construct "+n+" nodes");returnnodes;}privatevoidconnectNodes(intn,RingNode[]nodes){System.out.println("connecting nodes");nodes[n]=nodes[0];for(inti=0;i<n;i++){nodes[i].connect(nodes[i+1]);}}interfaceMessage{StringgetType();}privatestaticclassStartMessageimplementsMessage{publicStringgetType(){return"START";}}privatestaticclassStopMessageimplementsMessage{publicStringgetType(){return"STOP";}}privatestaticclassCancelMessageimplementsMessage{publicStringgetType(){return"CANCEL";}}privatestaticclassTokenMessageimplementsMessage{privateintnodeId;privateintvalue;publicTokenMessage(intnodeId,intvalue){this.nodeId=nodeId;this.value=value;}publicStringgetType(){return"TOKEN";}}privatestaticclassRingNodeimplementsRunnable{privateintnodeId;privateTimertimer;privateRingNodenextNode;privateBlockingQueue<Message>queue=newLinkedBlockingQueue<Message>();publicRingNode(intid,Timertimer,RingNodenextNode){nodeId=id;this.timer=timer;this.nextNode=nextNode;}publicvoidconnect(RingNodenode){nextNode=node;}publicvoidsendMessage(Messagem){queue.add(m);}publicvoidrun(){while(true){try{Messagem=queue.take();if(minstanceofStartMessage){log("Starting messages");timer.sendMessage(m);nextNode.sendMessage(newTokenMessage(nodeId,0));}elseif(minstanceofStopMessage){log("Stopping");nextNode.sendMessage(m);break;}elseif(minstanceofTokenMessage){if(((TokenMessage)m).nodeId==nodeId){intnextValue=((TokenMessage)m).value+1;if(nextValue%10000==0){log("Around ring "+nextValue+" times");}if(nextValue==1000000){timer.sendMessage(newStopMessage());timer.sendMessage(newCancelMessage());nextNode.sendMessage(newStopMessage());break;}else{nextNode.sendMessage(newTokenMessage(nodeId,nextValue));}}else{nextNode.sendMessage(m);}}}catch(InterruptedExceptionie){ie.printStackTrace();}}}publicvoidlog(Strings){System.out.println(System.currentTimeMillis()+" "+nodeId+": "+s);}}privatestaticclassTimerimplementsRunnable{privateBlockingQueue<Message>queue=newLinkedBlockingQueue<Message>();privatebooleantiming=false;privatelongstartTime;publicvoidsendMessage(Messagem){queue.add(m);}publicvoidrun(){while(true){Messagem;try{m=queue.take();if(minstanceofStartMessage){startTime=System.currentTimeMillis();timing=true;}elseif(minstanceofStopMessage){longend=System.currentTimeMillis();System.out.println("Start="+startTime+" Stop="+end+" Elapsed="+(end-startTime));timing=false;}elseif(minstanceofCancelMessage){break;}}catch(InterruptedExceptione){e.printStackTrace();}}}}}
I was a bit surprised by the result. It was slow and only 1 thread was really active at one time. This is why the test is particularly good. It is not trivial to reproduce the functionality in plain Java in an effective manner. It really shows how the concept of Actors can be useful.
We saw in a previous entry how one has to be careful with Double.NaN. Today we will see how regular double can cause problems. By the way the NaN issue was not Java specific and this issue is also general in different programming languages.
A coworker was shocked that in Java (I was a bit surprised he saw that only today, but it is true it can be surprising that such a simple thing does not work as expected):
408.16 - 40.82 = 367.34000000000003
In C, this would lead to the same result. This is all due to the binary represention of double numbers. Using the formula 2^(exponent)*1.mantissa where mantissa is on 52 bits, we have
We round to 52 bits, the mantissa is 0x9828F5C28F5C3 = 2676827028518339.
As a decimal, the internal value is (2676827028518339/2^52+1) * 256 = 408.1600000000000250111042987555265426635742
40.82 decomposition:
exponent = 32. Then 40.82/32= 1.275625 = 1 + 0x468F5C28F5C28F5C...*2^-52
Rounded to 52 bits, the mantissa is 0x468F5C28F5C29 = 1241304647293993
As a decimal, the internal value is (1241304647293993/2^52 + 1)*32 = 40.8200000000000002842170943040400743484497
The difference in decimal becomes 367.34000000000002472... which becomes 367.34000000000003 when represented in binary (to convince yourself you can apply the same technique).
The Solution
One solution to this problem is to use java.math.BigDecimal which stores a number as 2 integers, one for the digits, one for the exponent power of 10 (and not 2). The correct code would become: BigDecimal value = BigDecimal.valueOf(408.16).subtract(BigDecimal.valueOf(40.82));
value would then be 367.34.
But BigDecimal has also many potential for bugs. For example, you should never use the constructor taking a double but always the one taking a String.
new BigDecimal(408.16) = 408.16000000000002501110429875552654266357421875
This is because of the binary representation of 408.16 as a double. 408.16 is only an approximation of 408.16!
Another trick with BigDecimal is not to use equals(Object) but compareTo(Object) because 408.160 is not equal to 408.16 using equals.
Why Could not They Make it Work With Double?
If you were too lazy to follow the steps of the explanation. There is a simpler explanation. Imagine the representation of a number in base 3 with 2 "digits". Let's imagine 1/3 is represented as 0.1 (this is a very simple number representation) 1/3+1/3+1/3 becomes 0.1+0.1+0.1 = 1.0 (in base 3) = 1.0 if we convert to base 10. Now in base 10, 1/3 can only be represented as 0.3, so 1/3+1/3+1/3 = 0.3+0.3+0.3 = 0.9 <> 1.0. So BigDecimal is only interesting to handle ... decimals! In the enterprise world, this should be most apps. It is a bit sad it appeared so late in the JDK. It should really be a primitive type.
My brother just sent me a funny quote. I don’t know if it is true or not:
entwickeln Sie lieber überzeugende Lösungen anstatt viele Stunden mit Coding zu verbringen? Ist Ihnen die Produktivität Ihres Teams wichtig?
Mark Driver, VP Research von Gartner, kommentierte kürzlich
“Here’s a simple equation. In terms of mental fortitude…
1 Smalltalk developer = 2.5 C++ developers
1 C++ developer = 1.5 Java developers”.
You don’t need german to understand. Of course it can not be true. How can anyone measure mental fortitude? And how does it related with productivity is another issue.
If you want a VCS that is written in C++, go play with Monotone. Really.They use a “real database”. They use “nice object-oriented libraries”. They use “nice C++ abstractions”. And quite frankly, as a result of all these design decisions that sound so appealing to some CS people, the end result is a horrible and unmaintainable mess.
This however is quite interesting since I am probably not the only one to have seen the disastrous effects of too enthusiastic abstraction professionals in Java. This is why many Java projects are utter crap. But again, not everybody makes this mistake. Some people know how to build good things. As the result of Java being popular, we have many crap programmers with pseudo genius theories you don’t find in C or Haskell.
On another subject, I seriously wonder why we don’t have more distributed compilation in Java as in C/C++. I am tired of seing those core doing nothing while compiling.
I don'tknowwhat Sun had in mindwhencreatingDouble.NaNnumber. Itisveryinintuitiveto use. I am sure every single developer out therefell in thetrapoftryingtofind out if a double wasNaN or notusing:
Double.NaN == myDouble
Thisdoesnotwork (I don'tknowtherealreasonwhy), onehasto use:
Looking out at some old post. I found out I was not far from the truth in January 2008 when I stated:“In 2008 the Ruby On Rails mentality will continue to prevail. In the Java world, Grails is the most likely to benefit from it. (…) It could also be something based around Spring as their current MVC solution is not very good and very old fashioned."I don’t think I will be right with the provocative Java is dead. A post recently titled No Future In Java makes some good points about where the future of Java still is: the web applications. Grails is probably today the best contender in the Java world, far ahead from the others, and it leverages the Java developers. However I am not sure one can say that RIA is a fad, or that RIA will only be done in a super powerful browser in the future. Microsoft might be a game changer here. The real advantages of the browser application so far are: easy deployment (the killer argument IMHO), “simple” security. I would not be surprised if in the near future, Microsoft advertises a solution for RIA easily deployed, based on standard protocols (HTTP?), a bit like IBM does with Eclipse RIA, but much more ambitious.
These days, I have the feeling that Java is dead. Even if, or maybe because I have used Java so much in the past 10 years, I have this feeling.
In 1998 Java was revolutionary. It was a very simple to learn object oriented language with modern concepts and familiar syntax. Furthermore the standard library had neat features like internet networking and it could be integrated in the browser. All this at a time when the internet just started to be popular.
Today we have very few evolutions, a huge library (with lots of useful and useless stuff in). Some good stuff has been added like concurrent utils, but not many things changed overall. Open source languages like Python are much more dynamic in their library maintenance. The language does not seem to provide anything interesting when compared to the alternatives, like .NET or even with the “scripting languages” like Python. In the browser it has failed where Flash has succeeded.
Lots of things are too complicated to build in Java when compared to other languages. I feel Swing, database access (JDBC), JSP could be vastly improved to help developer productivity. Why is ORM less important in the Microsoft world? because the standard database layer of Microsoft is not as crappy as JDBC. Why don’t they have tons of web frameworks? because ASP.NET is decent, does more than JSP and does not get too much in your way at the same time. Microsoft finds the right balance between library complexity, power and developer friendlyness.
Browser applications are less popular, and desktop apps integrated to a “3-tier” architecture more popular. Java on the desktop is really weak. Give me .NET or QT anytime. There are still no big Java desktop apps on everyday people desktops except Eclipse (IBM has really done an impressive job with it). It is almost 2009 and I still have no Java app except the dev environment for my Java programmer job on my Linux desktop. I know that in my everyday job, I would be more productive with a .NET environment, just because Java sucks so much on the client side. Borland Delphi was more productive 10 years ago!
Java on the Mobile is a failure. Almost nobody uses it and is plagged with compatibility problems. However there is hope here, with Android from Google.
The only advantage of Java compared to .NET is that it is free. You have Tomcat, Glassfish for free. You can deploy on Linux. If you are a poor developer that’s quite an advantage. But most company pay for Java, they want the “security” of an IBM and they deploy on Windows machines. It does not make sense, those companies should buy the better Microsoft stack instead of IBM. And I am sure more and more will. Vista might be the big Microsoft failure, I am sure it will be fixed with Windows 7, and Microsoft dev tools are just getting better and better.
Scala, Groovy, JRuby don’t fix anything, they are just toy programming languages and are based on the JVM, on the Java libraries. In the lot, Scala does better because it has the concept of library, and they do try to build more interesting libraries than Sun. But it is too complex to be ever popular.
All the open source libraries in Java are fine but who needs to choose between 20 web frameworks, 5 loggers, etc.. There are very few really useful ones: hibernate, lucene, jmeter, junit.
If Java has no logical place in most companies, if it does not provide anything more than the alternatives, and is very weak on the desktop, what’s left to Java? the code base and the developers? That’s about it. It sounds a lot like Cobol in the early 90s. Java is dead.
It has been now a week since I have installed ArchLinux on my home computer. I daily use Ubuntu 8.10 at work.
Since the Ubuntu upgrade from 8.04 to 8.10 I have had problems with my Xorg settings. I just found out the nvidia-settings utility does not manage to save the configuration anymore. So I have to lookup on google and try to fix it. And that annoys me. That annoys me because the promess of Ubuntu is that everything works out of the box. In reality, you have to mess with the configuration as much as with ArchLinux.
There are 2 negative points of ArchLinux when compared to Ubuntu:
The install on a new computer takes a lot of time (not the 30min of Ubuntu) to have a decent desktop running. It can only be done by people ready to fiddle with config files0. But it is well documented in the Arch wiki. So ArchLinux is definately not newbie oriented.
Some proprietary software might not be installed easily. For a long time Oracle was not trivial to install. Now there is an AUR file for it, so it is quite simple.
Now the positive side:
good KDE 4.1.3 available
more up-to-date packages
“transparent” updates - no big breaking the system at each release.
learn to use the useful configuration files. They are not many to use in ArchLinux. One feels much more in control on what’s installed and what’s happening. They are not many config files to know in the end. Configuration ends up being no more difficult (for someone not addicted to point and click) than in Ubuntu.
fast boot
no crap forced upon you, for example PulseAudio. I have less problems with pure ALSA.
does not disappoint. You know you have to fiddle with the config from the start.
I tried another silly thing with Linux, ArchLinux. The setup is quite rough as you have to edit many config files manually. But if you know a bit your way around it takes only a few hours to have everything running well. The installation manual on the wiki is detailed enough to correct all eventual mistakes humans do.
I decided to try once more KDE 4 on it, as at first it was just a silly experiment: I was really not sure ArchLinux would be workable. In the end I am pleasantly surprised, KDE 4.1.3 is way way better than any other versions of KDE I have tried before. It is stable and quite pretty. It took the team a lot of time to get there but now I think KDE 4 is a very good window manager, pleasant to use.
It’s a big change from older versions which were too unstable/had too few features to be of any use.
I am not convinced with ArchLinux compared to Ubuntu. The setup is much more complex, less packages are available. True you learn a bit more with ArchLinux. We will see if it can keep working well for a few years.
I have read a blog post a few days ago about someone thinking a good programmer interview question was:
How does a hash table work?
While it is a very interesting question, I doubt many programmers (even relatively good ones) can answer that question. If I look back and think of all the employees in all the companies I have known, I can count on one hand people that can answer that question. I can think of 3 or 4 I met in one company, and maybe another 1 or 2 in different companies. And I don’t think anyone would have been able to go deeper in the details like mentioning closed-addressed vs open-addressed possible implementations.
I am so negative, because a question about some important details of the inner working of the Java HashMap was raised at work a week ago. I was the only one (because I had read several times about hash tables) to be aware that the equals method of the key object was called every time you do a table.get(xxx) or a table.put(xxx, yyy). Others thought only the hashCode() method was used.
This kind of interview question creates a high bias towards people coming straight out of school if they have Hashtable in their program. For people with more experience, it is highly likely if they ever read about it that they forgot the details (and maybe more than the details).
This can seem shocking as hash tables are used almost everywhere these days, but it’s a reality.