HTML table extractor
- ID: 2157
- Status: summarized
- Published: 30 Jun 2026, 7:38 AM
- Fetched: 30 Jun 2026, 8:50 AM
- Provider: Simon Willison
- Category: developer-ai
- Original URL: https://simonwillison.net/2026/Jun/29/html-table-extractor/
- Source URL: https://simonwillison.net/atom/everything/
Summary
- Score: 7.5
- Created: 30 Jun 2026, 8:50 AM
- Tags:
- Audience: developers, vibe coders, database learners, AI/ML learners, AI agent users
What happened
Simon Willison's HTML Table Extractor is a paste-conversion tool that extracts tables from copied rich text and converts them to HTML, Markdown, CSV, TSV, or JSON. It's particularly handy for quickly grabbing structured data from web pages like Wikipedia lists. A related update adds table support to his Rich text to Markdown tool.
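The extract-then-convert idea can be illustrated with a minimal Python sketch that uses only the standard library. This is not the tool's own implementation: the `TableExtractor` class and the `to_delimited`, `to_markdown`, and `to_json` helpers are hypothetical names, the sample table is made up for illustration, and nested tables, `colspan`/`rowspan`, and unclosed cells are deliberately not handled.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> in an HTML fragment as a list of rows of cell text.

    Hypothetical helper for illustration; assumes flat, well-formed tables.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables
        self._rows = None  # rows of the table currently being parsed
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text chunks of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the chunks and collapse runs of whitespace to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append(self._row)
            self._row = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._rows)
            self._rows = None


def to_delimited(rows, delimiter=","):
    """CSV by default; pass delimiter='\\t' for TSV."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_markdown(rows):
    """Render rows as a Markdown table, treating the first row as the header."""
    header, *body = rows

    def line(cells):
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    return "\n".join([line(header), line(["---"] * len(header)), *map(line, body)])


def to_json(rows):
    """Render rows as a JSON array of objects keyed by the header row."""
    header, *body = rows
    return json.dumps([dict(zip(header, r)) for r in body], ensure_ascii=False, indent=2)


# Sample input standing in for pasted rich text (illustrative data only).
html = """
<table>
  <tr><th>State</th><th>Capital</th></tr>
  <tr><td>Selangor</td><td>Shah Alam</td></tr>
  <tr><td>Penang</td><td>George Town</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(html)
rows = parser.tables[0]

print(to_delimited(rows))        # CSV
print(to_delimited(rows, "\t"))  # TSV
print(to_markdown(rows))
print(to_json(rows))
```

Keeping the parsed table as a plain list of rows means each output format is a small, independent function, so adding another format does not touch the parsing step.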
Why it matters
The tool saves developers time when repurposing tabular data copied from websites, a common step when building dashboards, importing data into databases, or preprocessing for AI/ML. For Malaysian builders, it’s a practical utility for extracting data from local government portals, research reports, or any table-heavy site.
Discussion angle
How can tools like this be combined with AI agents to automate data collection from Malaysian public sector websites, such as extracting statistics from tables published by the Department of Statistics Malaysia (DOSM), and feed the results into dashboards or analysis pipelines?