[TriEmbed] Pointers for attempting very basic Web Scraping?
Alan Porter
alan at alanporter.com
Wed Mar 12 20:07:43 CDT 2014
Before attempting to scrape and parse the newsletter HTML, you should
read this epic post on Stack Overflow about why regular expressions
are the wrong tool for parsing HTML.
RegEx match open tags except XHTML self-contained tags
http://stackoverflow.com/questions/1732348/#1732454
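
If you do go the scraping route, a real HTML parser is the safer choice.
Here is a minimal sketch using only Python's standard library
(html.parser). The SAMPLE markup, the LinkCollector class and the
extract_links helper are made up for illustration; I have not looked at
the newsletter's actual HTML, so treat this as a starting point rather
than a working scraper for it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> we are currently inside, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup, standing in for the real newsletter page.
SAMPLE = """<html><body>
<h1>Newsletter</h1>
<p>Read the <a href="/news/march.html">March <b>issue</b></a>.</p>
<p><a class="x" href='/news/april.html'>April issue</a></p>
</body></html>"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, "->", text)
```

Note that the parser copes with nested tags, extra attributes and
single-quoted attribute values without any special handling, which is
exactly where regex-based approaches tend to fall over.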
You could also volunteer to help publish the newsletter, so that others
might actually read it without the frustration, too.
Structure it as a blog with an RSS feed. Heck, since this is TMSA, you
could probably make it a class project.
Alan