1. Multi-URL training
Train the same extractor on multiple pages. When a website displays different variations of data on the same page type, you want to train against all of those variations.
2. Auto-optimize extractors
Whenever you save an extractor, Import.io automatically optimizes it to run in the shortest time possible.
3. URL generator
Use patterns such as page numbers and category names to automatically generate all of the URLs that you need in seconds.
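As a rough illustration of the idea (not Import.io's own implementation), pattern-based URL generation can be sketched in a few lines of Python. The `generate_urls` helper, the `{category}` and `{page}` placeholders, and the example.com address are all assumptions made for this example.

```python
from itertools import product


def generate_urls(template, **values):
    """Expand a URL template over every combination of the given values.

    Hypothetical helper for illustration only; it is not part of any
    Import.io API.
    """
    names = list(values)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(values[name] for name in names))
    ]


# Assumed site layout: example.com is a placeholder, not a real target.
urls = generate_urls(
    "https://example.com/{category}?page={page}",
    category=["shoes", "hats"],
    page=range(1, 4),
)

for url in urls:
    print(url)
```

Two category names and three page numbers expand to six URLs, one for each combination.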
4. Multiple pages
Extract data from multiple pages. We automatically detect paginated lists, or you can explicitly click the “next” page link to help us learn.
5. Website screenshots
Import.io helps ensure compliance and accuracy by allowing you to capture and save screenshots of every page from which you extracted data. This feature is easily accessible, and it is useful because it creates an auditable record of the extracted data.
6. Data behind a login
Authenticated extraction allows you to get data that is only available after logging into a website. You provide the appropriate credentials and Import.io will do the rest.
7. Download images and files
Download images and documents along with all the web data in one run. For example, retailers pull product images from manufacturers, and data scientists build training sets for computer vision.
8. Easy scheduling
Set up your web data extraction to run “on the regular” using pre-set or custom schedules: weekly, daily, hourly, whatever your business needs. Set it and forget it.
9. Interactive workflows
Record sequences of the actions that you need to perform on a website. For example, you may need to navigate between pages, enter a search term or change a default sort order on a list.