Java HTML Parsing Example With htmlparser

Every week, I post the top 10 most-read javablogs entries on this blog. I started doing it because I don't follow what happens over the weekend, and I don't check javablogs every day either, so the list picks up the interesting stories I would otherwise miss. Overall I find it a good way to stay up to date with what's happening on javablogs.

As mentioned in an earlier post, my library of choice for the parsing is htmlparser (on SourceForge), because it's free and open source, and because I am lazy and did not want to write my own. If you know a better open source library, feel free to leave a comment; I'll be glad to hear about it. htmlparser is not the easiest library to use: there are many entry points and it's not immediately clear which one to choose. So here is how I used it, in case it saves a few minutes for anyone facing the same task.

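  import java.util.regex.Matcher;
  import java.util.regex.Pattern;
  import org.htmlparser.Parser;
  import org.htmlparser.Tag;
  import org.htmlparser.lexer.Lexer;
  import org.htmlparser.lexer.Page;
  import org.htmlparser.util.ParserException;
  import org.htmlparser.visitors.NodeVisitor;

  private static Entry parseEntry(String content) throws ParserException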
  {
    final Entry entry = new Entry();

    final NodeVisitor linkVisitor = new NodeVisitor() {
      
      @Override
      public void visitTag(Tag tag) {
        String name = tag.getTagName();

        if ("a".equalsIgnoreCase(name))
            {
              String hrefValue = tag.getAttribute("href");
              if (hrefValue != null && !hrefValue.startsWith("http://"))
              {
                if (!hrefValue.startsWith("/")) hrefValue = "/"+hrefValue;
                hrefValue = "http://javablogs.com"+hrefValue;
                //System.out.println("test, value="+hrefValue);
              }
              if (hrefValue != null)
              {
                // Escape ampersands so the rewritten href is valid inside (X)HTML
                hrefValue = hrefValue.replaceAll("&", "&amp;");
                tag.setAttribute("href", hrefValue);                
              }
            }
      }
    
    };
    
    NodeVisitor visitor = new NodeVisitor() {

      @Override
      public void visitTag(Tag tag) {        
        String name = tag.getTagName();
            if ("span".equalsIgnoreCase(name) || "div".equalsIgnoreCase(name))
            {              
              String classValue = tag.getAttribute("class");
//                LOGGER.debug("visittag name="+name+" class="+classValue+"children="+tag.getChildren().toHtml());
              if ("blogentrydetails".equals(classValue))
              {
                // "+" requires at least one digit, so parseInt never sees an empty string
                Pattern countPattern = Pattern.compile("Reads:\\s*([0-9]+)");
                Matcher matcher = countPattern.matcher(tag.getChildren().toHtml());
                if (matcher.find())
                {
                  entry.count = Integer.parseInt(matcher.group(1));
                }
                
              }
              else if ("blogentrysummary".equals(classValue))
              {
                try
                {
                  tag.getChildren().visitAllNodesWith(linkVisitor);
                }
                catch (ParserException pe)
                {
                  LOGGER.error(pe,pe);
                }
                entry.description = tag.getChildren().toHtml();                 
                // Collapse runs of whitespace into a single space
                entry.description = entry.description.replaceAll("\\s+", " ");
              }
              else if ("blogentrytitle".equals(classValue))
              {
                try
                {
                  tag.getChildren().visitAllNodesWith(linkVisitor);
                }
                catch (ParserException pe)
                {
                  LOGGER.error(pe,pe);
                }
                entry.title = tag.getChildren().toHtml();
                entry.title = entry.title.replaceAll("\\s+", " ");
              }              
            }
            
      }

    };
    Parser parser = new Parser(new Lexer(new Page(content,"UTF-8")));
    parser.visitAllNodesWith(visitor);
    // An entry without a title is not usable, so signal that with null
    return entry.title != null ? entry : null;
  }
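
The two string-level steps buried in the visitors above (pulling the read count out of the details block, and turning relative links into absolute ones) can be tried without htmlparser at all. The following is only a minimal sketch using the JDK: the class name EntryTextHelpers and both method names are made up for illustration, and unlike the code above it also lets https:// links pass through unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper class: JDK-only versions of the string handling in parseEntry. */
class EntryTextHelpers {

    // Same idea as the "Reads:" pattern above; "+" requires at least one digit.
    private static final Pattern READ_COUNT = Pattern.compile("Reads:\\s*([0-9]+)");

    /** Returns the read count found in the fragment, or -1 if there is none. */
    static int extractReadCount(String html) {
        Matcher matcher = READ_COUNT.matcher(html);
        return matcher.find() ? Integer.parseInt(matcher.group(1)) : -1;
    }

    /** Prefixes site-relative hrefs with baseUrl; absolute hrefs are returned as-is. */
    static String toAbsolute(String baseUrl, String href) {
        if (href.startsWith("http://") || href.startsWith("https://")) {
            return href;
        }
        return baseUrl + (href.startsWith("/") ? href : "/" + href);
    }

    public static void main(String[] args) {
        System.out.println(extractReadCount("Posted today | Reads: 272"));
        // "some/page.html" is an invented path, used only to show the prefixing
        System.out.println(toAbsolute("http://javablogs.com", "some/page.html"));
    }
}
```

Keeping these two steps as plain functions makes them easy to unit test separately from the visitor plumbing.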

Top 10 Most Read Last Week On Javablogs.com, Week 21


Most read last week

  1. Spring vs JBoss, and why I don’t care about Sun standards (272): After a long time, it was interesting to see the Spring and JBoss folks engage in a public war of words, in comments on Matt Raible’s blog. [read]

  2. Kent Beck: "We thought we were just programming on an airplane" (231): JUnit co-creator Kent Beck says a number of things convinced he and Erich Gamma to create a new revision of JUnit after a long hiatus, including TestNG and Java 5. Last week at JavaOne, [read]

  3. Where are you, Project Manager with Technical Skills? (204): In Spain we are facing again a lack of workers with experience in development of not-so-cutting-edge technologies like J2EE. So, [read]

  4. Thanks... and good luck Bruce! (203): It is unfortunate that Bruce Tate forgot to enable comments to his final blog entry. It would be a shame to see him off without at least a small well-wishing. (possibly a little roast too ;-) [read]

  5. Google Web Toolkit Angst (202): I've been using Google Web Toolkit for the last week or so. I'm really liking it, it is really productive and once you getting it working everything is sweet. The problem is, [read]

  6. Is this simpler than Hibernate? (193): In an earlier blog entry I described an early cut of DynaModel, Slingshot's persistence engine. [read]

  7. Article: Don't repeat the DAO! : Build a generic typesafe DAO with Hibernate and Spring AOP (192): Don't repeat the DAO! : Build a generic typesafe DAO with Hibernate and Spring AOP is a developerWorks article by Per Mellqvist which presents a generic DAO implementation class based on Hibernate, [read]

  8. Why ORM Tools are Not Recommended (185): Sandeep Sha has written an a forum posting by Why ORM Tools are Not Recommended that has some interesting points. Although I do not agree with all the points, [read]

  9. The Dojo Toolkit in Practice (185): We have posted a new article on using the Dojo Toolkit in a project. The article discusses a piece of a project that uses Ajax to create a responsive itinerary viewer. [read]


Most read last week-end

  1. Spring vs JBoss, and why I don’t care about Sun standards (272): After a long time, it was interesting to see the Spring and JBoss folks engage in a public war of words, in comments on Matt Raible’s blog. [read]

  2. Thanks... and good luck Bruce! (203): It is unfortunate that Bruce Tate forgot to enable comments to his final blog entry. It would be a shame to see him off without at least a small well-wishing. (possibly a little roast too ;-) [read]

  3. Is this simpler than Hibernate? (193): In an earlier blog entry I described an early cut of DynaModel, Slingshot's persistence engine. [read]

  4. What’s Up With Huge Resumes? (150): What’s up with huge resumes these days? The company I work for has been hiring lately and so I usually end up interviewing one to two people a week. [read]

  5. Introducing jvm-languages.com (147): Back in September of 2004, I tried to write a book. It would have been called Dynamic Languages and Java. Unfortunately, I never completed it. [read]

  6. Comparison Between PMD vs Findbugs vs Hammurapi (135): Take a look at this one the differences between these three tools Differences [read]

  7. Then God said let there be Ubuntu... ahem (130): Finally I got a version of Linux, which works as good as XP or even better ;) ; using which I can get to do my work seamlessly. Its none other than Ubuntu Dapper. [read]

  8. Job Trend, Not Google Trend (121): Wanna know the amount of Java jobs versus .Net jobs, or the growth of AJAX jobs? Google Trend may be able to help you a bit, but the result is not scoped for jobs only. Indeed. [read]

  9. 1-Minute Quiz: Why is Hyphen Illegal in Identifier? (110): Why is hyphen (-) an illegal char in Java identifier? Why can't we use variable names like first-name, as we do in xml files? The answer to this question is not hard, but the challenge is, [read]


Top 10 Most Read Last Week On Javablogs.com, Week 20


Most read last week

  1. The Worst Java Job Interview Questions. (269): Why are you looking for a job? Strictly speaking, this is not a java question, but it shows up in almost every job interview I've been to. [read]

  2. Goodbye Ant , Welcome Maven 2 (219): After years of using Ant for building my applications, I have moved to something different, Apache Maven 2. And now it seems there is no looking back. [read]

  3. Google Web Toolkit: A Brief Review (219): Google has released GWT - a java window toolkit which converts your java applications (using the toolkit API) to javascript (incl. AJAX) and HTML. [read]

  4. A *bold* paper against Threads (214): Edward A. Lee wrote a paper called "The Problem with Threads", you can find his pdf paper here. There is no rant here but facts, and sound reasoning. [read]

  5. Outsourcing your code is so cheap ... but why are so many jobs coming back from their indian trip ? (202): There are websites where you can get very cheap developpers, here are the one I know: http://www.getacoder.com/ http://www.rentacoder.com/ http://www.getafreelancer.com/ http://www. [read]

  6. Signs You're a Crappy Programmer (and don't know it) (190): Please read this great post from Damien Katz, and watch the signs Java is all you'll ever need. "Enterprisey" isn't a punchline to you. [read]

  7. Google Web Toolkit: Web Applications Just Got Harder (182): Oh the buzz. Oh the excitement. Oh the AJaX Gods has released their secret sauce with an Apache license. Google Web Toolkit allows one to develop AJaX web applications entirely in Java, [read]

  8. PDFs available for JavaOne 2006 Sessions (177): Check out the JavaOne 2006 Conference Session Catalog: “Presentation files available for download are indicated with a paperclip icon. After clicking on a paperclip, [read]

  9. Google Web Toolkit for building AJAX apps in Java (173): Google has introduced a toolkit for building AJAX applications in Java, though its in beta. It has also supplied some sample applications with the kit. [read]


Most read last week-end

  1. PDFs available for JavaOne 2006 Sessions (177): Check out the JavaOne 2006 Conference Session Catalog: “Presentation files available for download are indicated with a paperclip icon. After clicking on a paperclip, [read]

  2. Cringely: Why IBM Is in Trouble (159): Robert X. Cringley doesnt have a high opinion of IBM. Last week, he wrote, ...what is IBM? IBM is a disaster-in-the-making. [read]

  3. JavaOne Gossip: NetBeans Pulls a Prank on Eclipse (147): Humor makes life fun. Life just got a lot funnier. For some I guess. netBeans - Eclipse 1-0. Post your suggestions on how Eclipse should get even. [read]

  4. Day 5: McNealy, Gosling, Gage: "Forget the box" (139): With a mixture of sadness, relief, and hope for the future, former Sun CEO Scott McNealy took the stage this morning at the final keynote address of JavaOne 2006. [read]

  5. Project Harmony gets AWT/Swing Contrib from Intel (127): This may be a bit late but at JavaONE this year JEdit was shown running on the AWT/Swing contribution that Intel gave to Project Harmony. [read]

  6. Java 7.0 (Dolphin): Evolving in the Ecosystem (121): Sun developer Danny Coward says "Compatibility is king", but Sun is not staying still in the Java space. [read]

  7. This is genuine Microsoft (120): I started playing with Google Web Toolkit beta- actually I didn’t really start. Because I had to uninstall IE7 (which I don’t use at all), but hey I’d been curious. [read]

  8. Become a Java Champion, stay in useless Country, Learning Java for what? (120): I just thinking, what should we learn Java? Why dont use dotNet, I read Matt blog about his income US$ 200k more, or Mike Conan in OZ, that become the good best company. Today I just dont know, [read]

  9. jBixbe: a java tool I consider ... buying ! (90): I found this tool on Erik's linkblog, thanks to him ! [read]


Top 10 Most Read Last Week On Javablogs.com, Week 19


Most read last week

  1. Axis2: Why bother? (257): The Axis team is kicking up a big fuss about their recent release of Axis 2 (1.0!) Surprisingly, this library is so so abysmally bad, [read]

  2. Google trends proves: Java is doomed (251): Google trends is a nice idea, and I had to apply it adhoc to Java, Ruby, Python and C#. Interesting results, I can see a decline in Java! [read]

  3. Rich Open Source Webmail that doesn't suck (219): Guys...lets face it. Squirrel Mail... So check out our killer rich webmail. [read]

  4. Your Next Programming Language (216): Many people talk about how, as software developers, we should learn new programming languages frequently. [read]

  5. How to recognize a "Sacred Code" (210): You know you are dealing with a "sacred code" when you ask a previous developer (or the designer of the code) a question about the code and his immediate reply is .... [read]

  6. All you ever wanted to know about Workflow and how it relates to Java, Transactions and Concurrency (204): Read this blog carefully and you're in for a PAYRAISE. Workflow and business process technology will be essential in developing next generation applications. The knowledge about it is scarce. [read]

  7. Omg - I love this (Mac users may not) (203): This guy doesn't like Macs Damnation this is funny.... [read]

  8. 7 Reasons Why Web Apps Fail (179): Web applications are popping up faster and faster every day, and quite a few are using the power that Ajax offers to their advantage. [read]

  9. Scaling out 37 Signal-style applications is convenient (179): I had someone telling me that: Ruby can scale. Basecamp prooves that. Now, you all know that I do not think that Ruby has ANY problems with scaling. However, [read]


Most read last week-end

  1. Omg - I love this (Mac users may not) (203): This guy doesn't like Macs Damnation this is funny.... [read]

  2. How to Design a Good API (176): I was reading this presentation on the Design of API's by Joshua Bloch it talk's about how to design a good api but more importantly the reasons why doing certain things results in a good design. [read]

  3. JRuby on Rails Is Born (172): JavaOne attendees are in for a treat. Not only will they be receiving a DDJ issue which calls Rails a tipping-point to a new era in enterprise computing (or something like that)... [read]

  4. JavaOne day -1 : Bird Strike (147): The plan was to fly out from Sydney to San Francisco today. The plane was fueled, the travelers boarded. The aircraft taxied out to the runway, takeoff speed was reached, [read]

  5. 10 things i love about my Mac (125): A switcher Top 10 of nice things on Mac OS X: 1. The way programs live in the system (no registry shit) 2. The shell 3. Firewire boot capabilities 4. Apps like iChat, iSync and Addressbook 5. [read]

  6. YouTube bandwith usage/costs ... AMAZING ! (121): While looking for successfull video hosting I found this techcrunch article about youtube called Did YouTube Just Raise another $25 million? [read]

  7. 10 things i hate about my Mac (113): A switcher Top 10 of ugly issues with Apple Mac OS X: 1. No @ key in boot camp windows installation available 2. All banking programs on mac really suck 3. adv. [read]

  8. Commons Collections 3.2 Released (113): Commons Collections 3.2 has been released. Commons Collections is a library that builds upon the Java Collection Framework. It provides additional Map, [read]

  9. GoogleTrends : Java vs C# vs PHP (112): La comparaison est un poison. Ceci dit, comparer "l'intérêt" pour java, C# et PHP avec GoogleTrends, le nouveau service de Google, était très tentant... [read]


First Steps With EhCache

If you need to cache objects in your system, Ehcache is a simple, widely used and well-tested cache written in Java. This short tutorial is for people who don't want to dig through the documentation first, but just want to check that it works in their project and see how easy it is to set up.
Installation
Download Ehcache from the Download link on http://ehcache.sourceforge.net. Current release is 1.2.
Unpack Ehcache with an unpacker that knows the tgz format. For Unix users this is trivial; for Windows users, 7zip is a free (and open-source) unpacker. It is probably the most popular, but there are others such as tugzip, izarc or winrar.
Your Java project needs ehcache-1.2.jar, commons-collections-2.1.1.jar and commons-logging-1.0.4.jar (version numbers may vary) in its classpath. Those libraries are shipped with Ehcache.
Cache Configuration
Write an ehcache.xml file describing the caches you want to use. A project can have several such files, and each file can describe several caches. Here I use a persistent cache. The configuration file is well described at http://ehcache.sourceforge.net/documentation/configuration.html
<ehcache>
<cache name="firstcache" maxElementsInMemory="10000" eternal="false" overflowToDisk="true" timeToIdleSeconds="0" timeToLiveSeconds="0" diskPersistent="true" diskExpiryThreadIntervalSeconds="120"/>

</ehcache>

Code
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

// The members below live in TestClass; LOGGER is that class's own logger (declaration not shown).
private static CacheManager cacheManager;
private static Cache cache;

static {
  // Create a CacheManager using a specific config file
  cacheManager = CacheManager.create(TestClass.class.getResource("/config/ehcache.xml"));
  cache = cacheManager.getCache("firstcache");
}

/**
 * Retrieves the value from the cache if it exists;
 * if not, creates it and adds it to the cache.
 */
public String doit(String key, String value) {
  // Get an element from the cache by key
  Element e = cache.get(key);
  if (e != null) {
    value = (String) e.getValue();
    LOGGER.info("retrieved " + value + " from cache");
  }
  else {
    value = "new value";
    cache.put(new Element(key, value));
  }
  return value;
}

/**
 * Refreshes the value for the given key by evicting it;
 * the next call to doit() recreates it.
 */
public void refresh(String key) { cache.remove(key); }

/**
 * To be called when your application is exiting.
 */
public void shutdown() { cacheManager.shutdown(); }
Conclusion
Using EhCache is as simple as using a Java Map with an additional configuration file.
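
To make the comparison concrete, here is roughly what the same get-or-create logic looks like on a plain Map. This is only a sketch of the analogy, not Ehcache code: the class MapBackedCache and its methods are invented for the illustration, and it has none of what the configuration file above buys you (overflow to disk, persistence across restarts, expiry).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory stand-in that mirrors doit() and refresh() above. */
class MapBackedCache {

    private final Map<String, String> store = new ConcurrentHashMap<>();

    /** Returns the cached value, computing and storing it on a miss. */
    String doit(String key, Function<String, String> loader) {
        return store.computeIfAbsent(key, loader);
    }

    /** Evicts the key so that the next doit() call recomputes the value. */
    void refresh(String key) {
        store.remove(key);
    }

    int size() {
        return store.size();
    }

    public static void main(String[] args) {
        MapBackedCache cache = new MapBackedCache();
        System.out.println(cache.doit("k", key -> "new value"));
        cache.refresh("k");
        System.out.println(cache.size());
    }
}
```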

Null vs. Errors

I am not particularly a fan of JCS (the Jakarta Java Caching System), as I find the ehcache code very clean and simple. But I have to say the author has some good comments on the site:
 
Nulls vs. Errors
 
I started to support ObjectNotFoundExceptions for failed gets but the overhead and cumbersome coding needed to surround a simple get method is ridiculous. Instead the JCS return null.
 
Having seen the ObjectNotFoundException "pattern" too many times, I can only agree!
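
For anyone who has not had to live with that pattern, the difference at the call site looks something like the sketch below. The store class and its methods are hypothetical (this is not the JCS API); only the exception name comes from the quote above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store offering both lookup styles, to compare the calling code. */
class NullVsException {

    static class ObjectNotFoundException extends Exception {
        ObjectNotFoundException(String key) {
            super("No object for key " + key);
        }
    }

    private final Map<String, String> data = new HashMap<>();

    void put(String key, String value) {
        data.put(key, value);
    }

    /** Null-returning style: a miss is an ordinary result. */
    String get(String key) {
        return data.get(key);
    }

    /** Exception style: a miss forces every caller into try/catch. */
    String getOrThrow(String key) throws ObjectNotFoundException {
        String value = data.get(key);
        if (value == null) {
            throw new ObjectNotFoundException(key);
        }
        return value;
    }

    String lookupWithNull(String key) {
        String value = get(key);
        return value != null ? value : "default";
    }

    String lookupWithException(String key) {
        try {
            return getOrThrow(key);
        } catch (ObjectNotFoundException e) {
            return "default";
        }
    }
}
```

Both lookups behave the same; the second one just needs three times the ceremony for what is, in a cache, a perfectly normal outcome.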
 

Algorithms in Java (Third Edition) Book Review

The book Algorithms in Java is huge but, unlike most huge books, its content is very interesting. It can be used as reference material, or as a toilet book (to learn things while you would otherwise be wasting time).

You will learn simple things, like what is the "raison d'être" of linked lists. The author gives very good examples to illustrate his propositions. He uses the sieve of Eratosthenes and the Josephus problem to explain the respective advantages of arrays and linked lists.

You will learn step by step everything there is to know about algorithms. Recursion, divide and conquer, and trees come first, and that knowledge is useful for the later sorting and searching chapters.

The chapter on Hashing will make you understand very clearly why the source of String.hashCode() is
public int hashCode() {
  int h = hash;
  if (h == 0) {
     int off = offset;
     char val[] = value;
     int len = count;
     for (int i = 0; i <  len; i++) h = 31*h + val[off ++];
     hash = h;
  }
  return h;
} 
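
Stripped of the caching and offset bookkeeping, that method is a base-31 polynomial evaluated with Horner's rule. The small sketch below (the class name PolynomialHash is mine) recomputes it by hand so that the result can be compared with String.hashCode().

```java
/** Recomputes the String hash as a base-31 polynomial, without the cached field. */
class PolynomialHash {

    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Horner's rule: multiply the running value by 31, then add the next char
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so "ab" hashes to 31 * 97 + 98 = 3105
        System.out.println(hash("ab"));
        System.out.println(hash("hello") == "hello".hashCode());
    }
}
```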

There might be too much information on the different types of sorting algorithms, and at that point the book becomes more of a reference than anything else. But overall, you will learn plenty from this book. It is very well written, complete, and will refresh your memory. I find it useful to reread things I learnt a few years ago, as I then have a very different view of the subject and (sometimes) pay closer attention to details I completely missed the first time.

Caching HTTP Responses in Java

Caching HTTP responses can dramatically improve the performance of your app if what you generate is in reality not very dynamic. There are many free caching frameworks in Java. The most popular seem to be ehcache, oscache, jcs and JBoss Cache.

ehcache is quite simple to use and its code is clean. It provides a CachingFilter that you can add to your webapp to cache HTTP responses transparently. However, as the framework only allows you to store Objects (which makes sense for most uses), I was wondering how it cached the HTTP response, which is a stream. I was a bit disappointed by the answer: it just copies the response into a ByteArrayOutputStream and calls toByteArray() to store the result in the cache. While this is optimal for a memory cache store (the whole response will be in the cache anyway, although I am not sure whether particularly big responses are detected and skipped), I don't think it is that good for a disk cache store.

Ideally the response would be stored through a buffer, to avoid holding the whole response in memory. This would allow much higher concurrency. I think it is doable by writing your own CachingFilter and using a bounded queue from the java.util.concurrent utilities to block writing when the buffer is full.
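
The core of that idea, leaving the servlet filter aside, could look like the sketch below. It is only an illustration under my own assumptions: the class BoundedChunkPipe and its methods are invented, the writer side stands in for the response wrapper, and the sink stands in for the disk store (a FileOutputStream in real use; the demo uses an in-memory sink only so that it can print the result).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical bounded pipe: at most maxChunks chunks are ever held in memory. */
class BoundedChunkPipe {

    // End-of-stream marker, recognised by identity rather than by content
    private static final byte[] END = new byte[0];

    private final BlockingQueue<byte[]> queue;

    BoundedChunkPipe(int maxChunks) {
        queue = new ArrayBlockingQueue<>(maxChunks);
    }

    /** Blocks while the buffer is full, which throttles the writer. */
    void write(byte[] chunk) throws InterruptedException {
        queue.put(chunk.clone());
    }

    void close() throws InterruptedException {
        queue.put(END);
    }

    /** Copies chunks to the sink until close() is seen; returns the byte count. */
    long drainTo(OutputStream sink) throws IOException, InterruptedException {
        long total = 0;
        for (byte[] chunk = queue.take(); chunk != END; chunk = queue.take()) {
            sink.write(chunk);
            total += chunk.length;
        }
        return total;
    }

    /** Pushes five chunks through a two-slot buffer and returns what reached the sink. */
    static String demo() {
        BoundedChunkPipe pipe = new BoundedChunkPipe(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    pipe.write(("chunk" + i + ";").getBytes(StandardCharsets.UTF_8));
                }
                pipe.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            pipe.drainTo(sink);
            producer.join();
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The memory held per response is bounded by the queue capacity times the chunk size, whatever the total response length, which is the property a disk store would want.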

I googled for this kind of thing without success. I only found solutions similar to the ehcache one, for example Sun's CachingResponseWrapper and CachingFilter, or the oscache CacheFilter (a bit more careful, but still a toByteArray()). I wonder why this has not already been done and published.

Javablogs.com 2005 Top 20

Most read in 2005

  1. New in Hibernate 3: Criteria API enhancements (816): Projection, aggregation, subselects, detatched criterias - its all there in the Hibernate 3 Criteria API. Let me show you some examples, starting with the new projection API.… [read]

  2. Hello, IDEA! (555): From the recent Java IDE discussions, it seems like there's a good portion of Java programmers who don't know IntelliJ IDEA, or simply haven't tried it yet. In this short screencast,… [read]

  3. Hey Gosling: This is why we don’t use Java 5.0 yet! (461): Whenever there is a major JDK update everyone on the Sun Microsystems side seems to love to beat the drum of upgrading.… [read]

  4. MSN7.5?? (450): ?????Google Talk????????????????????????????????????Google Talk???????? Google Talk????????????????????????????????????   [read]

  5. The worst code I've ever seen. Yes, that's true. (422): I've been in this business for 25 years and have been programming since the mid seventies. Even when I was a young pup, full of bright ideas and hubris and those around me were at least as bad,… [read]

  6. JSP is officially dead (407): Well, it looks like with the Final Draft of Java EE 5, the final nail has been placed in JSP's coffin.… [read]

  7. My wife is hot and she can code, too (400): My wife Keri loves puzzle games--tetris, scrabble, crosswords, text twist--any game where you have to figure something out, she's on it. She has a degree in CS, and is employed as a UI specialist,… [read]

  8. IBM announced SOMA - Service-Oriented Modelling and Architecture (396): [read]

  9. G-mail runs on Tomcat???!!!!! (390): Well, hello guys and gals...haven't been in the Java blog scene recently, but am glad to know Java is going strong as ever (not that I ever doubted it,… [read]

  10. Web Services articles (374): [read]

  11. Bill Gates tries Firefox (362): Tim Weber of the BBC reporter rustles up Bill Gates quote of 2005: Bill Gates is one of the people with Firefox on his computer, so I asked him for his opinion. I played around with it a bit,… [read]

  12. Language Oriented Programming: Everything is a Language (361): Some people don't 'get' Language Oriented Programming. It's a different perspective. Once you make the mental shift, everything starts to fall into place. Over the past few months,… [read]

  13. The killer app for Web 2.0 has arrived (356): Time to throw in the towel, 37 Signals. The future has arrived: iClock (via flocksucks) [read]

  14. New Search Engine Blows Google Away (356): After months of incredibly secret development, PreviewSeek Limited has launched the PreviewSeek search engine. My initial impression? It blows google away with its far more powerful searches. [read]

  15. Death to Apache (356): So our Apache heros have now decided that it isn't quite enough to prove to the world that they are abysmal failures at producing a J2EE container,… [read]

  16. WebSphere 6.0 System Management Enhancements (356): [read]

  17. It's Official, Struts is History! (352): [read]

  18. What Steve isn't telling us (348): So the rumors were true, Apple is really switching to Intel.  There are a lot of interesting things in Steve Jobs' keynote, as usual, but the most interesting part is,... [read]

  19. RE: Why I Ditched Hibernate (346): I saw this post and couldn't help but respond. The post's author, Bruce, is ditching Hibernate and Spring b/c he wants to use a connection pool (configured in Tomcat) instead.… [read]

How To Use Java With Blogger: A Tutorial

Blogger has a REST API. I use it to retrieve particular posts or to post transformed data. There is no Java API that I know of, but as you will see here, it is not very difficult to talk to the Blogger API from Java using plain old XML.

Using the commons-httpclient and DOM4J libraries, it would be quite easy to implement your own Java Blogger API, as the following code suggests.

Authenticate
All requests need to be authenticated and are made over HTTPS. I use commons-httpclient to perform the requests. Here is how to set up the client:
// "user" and "password" are fields of the enclosing class holding your Blogger credentials
private HttpClient initHttpClient()
{
  HttpClient client = new HttpClient();
  // Prefer digest authentication, fall back to basic
  List<String> authPrefs = new ArrayList<String>(2);
  authPrefs.add(AuthPolicy.DIGEST);
  authPrefs.add(AuthPolicy.BASIC);
  client.getParams().setParameter(AuthPolicy.AUTH_SCHEME_PRIORITY, authPrefs);
  // Send the credentials with the first request instead of waiting for a 401
  client.getParams().setAuthenticationPreemptive(true);
  Credentials defaultcreds = new UsernamePasswordCredentials(user, password);
  client.getState().setCredentials(new AuthScope("www.blogger.com", 443, AuthScope.ANY_REALM), defaultcreds);
  return client;
}
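
If you are curious what the basic scheme actually puts on the wire, the header is just the user name and password joined by a colon and Base64-encoded. The sketch below builds it with the JDK only; the class name BasicAuthHeader is mine, and with commons-httpclient you never need to do this by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds the value of an HTTP "Authorization" header for the Basic scheme. */
class BasicAuthHeader {

    static String forCredentials(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Classic example pair from the HTTP authentication RFCs
        System.out.println(forCredentials("Aladdin", "open sesame"));
    }
}
```

Since Base64 is an encoding and not encryption, this is also the reason the requests have to go over HTTPS.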


Get Your Posts
To retrieve the posts, you just have to query the right url, and parse the XML response. I prefer to use DOM4J, because of its handy asXML() method to print a node as XML. For simplicity I use a Map to store an XML entry.

// "client" is the HttpClient returned by initHttpClient(); "blogId" and "LOG" are fields of the enclosing class
public Collection<Map<String, String>> getPosts() throws HttpException, IOException, DocumentException
{
  GetMethod get = new GetMethod("https://www.blogger.com/atom/" + blogId);
  try
  {
    int statusCode = client.executeMethod(get);
    if (statusCode != HttpStatus.SC_OK)
    {
      throw new RuntimeException("Could not make HTTP request properly: " + get.getStatusLine());
    }
    InputStream response = get.getResponseBodyAsStream();
    SAXReader reader = new SAXReader();
    Document doc = reader.read(response);
    Collection<Map<String, String>> posts = new ArrayList<Map<String, String>>();
    List entries = doc.getRootElement().elements("entry");
    if (LOG.isDebugEnabled())
    {
      LOG.debug("found " + entries.size() + " entries");
    }
    for (int i = 0; i < entries.size(); i++)
    {
      Element entry = (Element) entries.get(i);
      Map<String, String> m = new HashMap<String, String>();
      for (Iterator it = entry.elementIterator(); it.hasNext();)
      {
        Element detail = (Element) it.next();
        String name = detail.getName();
        if (name.equals("link"))
        {
          // attributeValue returns null instead of failing when href is absent
          m.put("link", detail.attributeValue("href"));
        }
        else if (name.equals("content"))
        {
          // Keep the markup of the post body
          m.put("content", detail.asXML());
        }
        else
        {
          m.put(name, detail.getTextTrim());
        }
      }
      posts.add(m);
      if (LOG.isDebugEnabled())
      {
        LOG.debug("found=" + m.get("title") + ", url=" + m.get("link"));
      }
    }
    return posts;
  }
  finally
  {
    // Always hand the connection back to the connection manager
    get.releaseConnection();
  }
}
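
The entry-to-Map step can be tried offline, without Blogger or DOM4J. The sketch below does the same walk with the XML parser bundled in the JDK; the class AtomFeedSketch and its tiny SAMPLE feed are invented for the illustration, and unlike the DOM4J version it stores the text of every element, including content, rather than its markup.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical JDK-only version of the entry-to-Map conversion. */
class AtomFeedSketch {

    // Made-up feed with a single entry, just enough to exercise the parser
    static final String SAMPLE =
        "<feed xmlns=\"http://purl.org/atom/ns#\">"
        + "<entry><title> Hello </title><link href=\"http://example.org/1\"/></entry>"
        + "</feed>";

    static List<Map<String, String>> parseEntries(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Map<String, String>> posts = new ArrayList<>();
            NodeList entries = doc.getDocumentElement().getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Map<String, String> post = new LinkedHashMap<>();
                NodeList details = entries.item(i).getChildNodes();
                for (int j = 0; j < details.getLength(); j++) {
                    Node node = details.item(j);
                    if (node.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace and comments between elements
                    }
                    Element detail = (Element) node;
                    if ("link".equals(detail.getTagName())) {
                        post.put("link", detail.getAttribute("href"));
                    } else {
                        post.put(detail.getTagName(), detail.getTextContent().trim());
                    }
                }
                posts.add(post);
            }
            return posts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseEntries(SAMPLE));
    }
}
```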

Create XML for a new Post
Nothing particular here, just XML production.
private String createXmlForCreatePost(String postTitle, String postContent) throws IOException
{
  Document doc = DocumentHelper.createDocument();
  // Root <entry> element in the Atom namespace
  QName rootName = DocumentHelper.createQName("entry", new Namespace("", "http://purl.org/atom/ns#"));
  Element root = doc.addElement(rootName);
  Element title = root.addElement("title");
  title.addAttribute("mode", "escaped");
  title.addAttribute("type", "text/plain");
  title.setText(postTitle);
  Element generator = root.addElement("generator");
  generator.addAttribute("url", "http://31416.org");
  generator.setText("31416 Java Generator");
  Element content = root.addElement("content");
  content.addAttribute("type", "application/xhtml+xml");
  //Element div = content.addElement(DocumentHelper.createQName("div",new Namespace("","http://www.w3.org/1999/xhtml")));
  //div.add(...); //YOUR XHTML HERE
  // Serialize the document to a String
  StringWriter result = new StringWriter();
  XMLWriter writer = new XMLWriter(result);
  writer.write(doc);
  writer.close();
  return result.toString();
}

That's it

