How Octoparse works to extract web data

1.1 Octoparse simulates human browsing behaviors

Octoparse works by simulating human browsing behaviors in its built-in browser. Actions like opening web pages, clicking page elements, clicking the next page button, or scrolling down the page can all be done in Octoparse. The simulated scraping process is identical to how you would access the web data in any everyday browser.

1.2 Octoparse scrapes data automatically through workflow

When you build a scraping task in Octoparse, you are essentially creating a scraping workflow that is translated into a series of instructions for Octoparse to follow. This workflow, however, is created automatically by Octoparse while you interact with the built-in browser.

In some cases, you may not need to modify the auto-created workflow; in other cases, you may need to build or troubleshoot the workflow manually if things are not working as expected. In either case, it is strongly recommended that you grasp the basics of the workflow so you can scrape more precisely and accurately.

A workflow consists of a list of actions that are put together in a specific order to scrape the target web data. The steps of the workflow should always be read from top to bottom, and from inside to outside for nested actions. Let's take a look at some examples.

Example 1 - Extract from a list of elements to get data

Step 1: Go to Web Page, to open the target web page
Step 2: Pagination, to locate the next page button on the page you are currently on
Step 3: Loop Item, to locate the list of elements on the page
Step 4: Extract Data, to extract the needed data from the list of elements
Step 5: Click to Paginate, to click on the next page button to go to the next page
Step 6: Continue to extract data from the loop, and click the next page button until Octoparse gets to the last page
Step 7: No next page button is located on the last page and the workflow ends

Example 2 - Click a list of elements on the web page and extract data from the detail page

Step 2: Pagination, to locate the next page button on the page you are currently on
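
To make the control flow of Example 1 concrete, here is a minimal Python sketch of the same seven steps. It does not use Octoparse or any real browser: the `Page` class and the `run_workflow` function are hypothetical names introduced only for illustration, and the "web pages" are small in-memory objects. It only shows how the pagination loop wraps the item loop and why the workflow stops on the last page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a rendered web page (not an Octoparse type)."""
    items: List[str]                    # the list of elements on the page
    next_page: Optional["Page"] = None  # where the next page button leads, if present


def run_workflow(start: Page) -> List[str]:
    """Walk the pages as in Example 1 and return every extracted value."""
    extracted: List[str] = []
    page = start                            # Step 1: Go to Web Page
    while True:
        next_button = page.next_page        # Step 2: Pagination (locate the button)
        for item in page.items:             # Step 3: Loop Item
            extracted.append(item.strip())  # Step 4: Extract Data
        if next_button is None:             # Step 7: no button on the last page
            break
        page = next_button                  # Step 5: Click to Paginate
        # Step 6: the outer loop repeats Steps 2-5 on the new page
    return extracted


# Three small fake pages chained together by their "next page button".
page3 = Page(items=["item 5"])
page2 = Page(items=["item 3", "item 4"], next_page=page3)
page1 = Page(items=["item 1 ", " item 2"], next_page=page2)

result = run_workflow(page1)
print(result)
```

The item loop sits inside the pagination loop, which mirrors the advice to read nested actions from inside to outside: data is extracted from every element on a page before the next page button is clicked.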