Readability Introduces New “Iris” Parser  

From the Readability blog:

Once the content type is determined, there’s still the complex task of knowing precisely what to tease out of a web resource. Even web articles—Readability’s wheelhouse—are comprised of much more than just a headline and body text. With Iris, Readability gains the ability to glean a whole new level of insight into what facets of a web resource matters to readers and developers: titles and headlines. Subheadlines. Lead images. Videos. Excerpts. Authors. Languages. Captions. Beyond just a great end-user experience, Iris represents a powerful bridge to the new ways content is being consumed beyond the browser.

Obviously, web content isn’t just about text anymore, and it sounds like Readability has been optimizing its engine to better “understand” article layouts and media types.

I’ll be giving this a try in the next few days and will report back.

 
