[TriEmbed] Pointers for attempting very basic Web Scraping?

Alan Porter alan at alanporter.com
Wed Mar 12 20:07:43 CDT 2014


Before attempting to scrape and parse the newsletter HTML, you should
read this epic post on Stack Overflow about HTML parsing.

RegEx match open tags except XHTML self-contained tags
http://stackoverflow.com/questions/1732348/#1732454
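
The short version of that post is: use a real HTML parser rather than
regular expressions. As a rough sketch only, assuming Python and its
standard-library html.parser module, pulling the links out of a page
might look something like the following. The LinkCollector class, the
extract_links helper, and the sample HTML are all made up for
illustration; none of it is taken from the actual newsletter.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample input, standing in for the real newsletter HTML.
SAMPLE = """
<html><body>
<p>Made-up newsletter snippet.</p>
<a href="http://example.com/a">First <b>item</b></a>
<A HREF='http://example.com/b'>Second item</A>
<a name="anchor-only">No link here</a>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

The parser copes with mixed case, either quote style, and nested tags
inside the link text, which are exactly the things that tend to trip up
a hand-written regular expression.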

You could also volunteer to help with the publishing of the newsletter,
so that others might actually read it without the frustration, too. 
Structure it as a blog with an RSS feed.  Heck, since this is TMSA, you
could probably make it a class project.

Alan






