Lawyer contact list – web page scraper
Requirement: Create web scrapers to scrape multiple web sites and extract the contact details of more than 20,000 lawyers.
Background: The client asked us to create a web scraper to extract the contact information of more than 20,000 lawyers from 5 different web sites. While most of the contact information was viewable on the web pages, some of it, such as the lawyer’s email address, was contained in vCard files. A vCard file (.vcf) is a standard format for storing contact information, used by Microsoft Outlook and other contact applications.
Solution: We created a web scraper that not only gathered the contact information for each lawyer but also automatically downloaded the vCard files and extracted the pertinent contact information from them. All the contact information was then exported to an Excel spreadsheet with hyperlinks to the original web pages, in case our client needed to check the original source of the scraped data.
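The vCard step can be illustrated with a minimal Python sketch. It is not the scraper built for this project: the function names, the sample vCard, and the example.com URL are all hypothetical, and it writes a CSV file (which Excel can open) rather than a native Excel workbook. It assumes each downloaded vCard is available as text and only reads the FN, EMAIL and TEL properties.

```python
import csv
import io

# Hypothetical sample of a downloaded vCard; real files vary by site.
SAMPLE_VCARD = (
    "BEGIN:VCARD\r\n"
    "VERSION:3.0\r\n"
    "FN:Jane Doe\r\n"
    "TEL;TYPE=WORK:555-0100\r\n"
    "EMAIL;TYPE=INTERNET:jane.doe@example.com\r\n"
    "END:VCARD\r\n"
)


def parse_vcard(text):
    """Return the first name, email and phone found in one vCard."""
    # Unfold continuation lines: a line that starts with a space or tab
    # continues the previous line.
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    contact = {"name": "", "email": "", "phone": ""}
    wanted = {"FN": "name", "EMAIL": "email", "TEL": "phone"}
    for line in lines:
        if ":" not in line:
            continue
        head, value = line.split(":", 1)
        # Drop ";TYPE=..." parameters and any "item1." group prefix.
        prop = head.split(";", 1)[0].split(".")[-1].upper()
        key = wanted.get(prop)
        if key and not contact[key]:
            contact[key] = value.strip()
    return contact


def export_rows(records, out):
    """Write one CSV row per (source_url, vcard_text) pair."""
    writer = csv.writer(out)
    writer.writerow(["Name", "Email", "Phone", "Source URL"])
    for source_url, vcard_text in records:
        c = parse_vcard(vcard_text)
        writer.writerow([c["name"], c["email"], c["phone"], source_url])


if __name__ == "__main__":
    buffer = io.StringIO()
    export_rows([("https://example.com/lawyers/jane-doe", SAMPLE_VCARD)], buffer)
    print(buffer.getvalue())
```

Keeping the source URL in its own column preserves the link back to the original page for each record, which is the purpose the hyperlinks served in the delivered spreadsheet.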
Alternative applications for web scraper: Using the same method that extracted the lawyer contact information from the various web sites, we’re able to spider and extract data from any site. The extracted list can be a contact directory, but can likewise be a competitor’s pricing list, a product description list, or a vendor’s product list. Regardless of how complex the web site’s programming is, if you can view the data, we can create web scrapers to extract it and export it in the format that you want.
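As a rough illustration of how the same approach carries over to other kinds of lists, the Python sketch below pulls named fields out of an HTML table using only the standard library. The markup, the class names ("product", "price") and the RowExtractor helper are assumptions made up for this example; a real site would need its own field mapping.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real pages will use different markup.
SAMPLE_HTML = """
<table>
<tr><td class="product">Widget</td><td class="price">$9.99</td></tr>
<tr><td class="product">Gadget</td><td class="price">$24.50</td></tr>
</table>
"""


class RowExtractor(HTMLParser):
    """Collect one dict per <tr>, keyed by the class of each <td>."""

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.rows = []
        self._row = None    # dict for the row being read, or None
        self._field = None  # field name of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            css_class = dict(attrs).get("class") or ""
            self._field = css_class if css_class in self.fields else None

    def handle_data(self, data):
        # Only keep text that sits inside a cell we were asked for.
        if self._field and self._row is not None:
            self._row[self._field] = self._row.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def extract_rows(html, fields):
    parser = RowExtractor(fields)
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML, ["product", "price"]):
        print(row)
```

Only the list of field names changes between a contact directory and a pricing list in this sketch; the row-collecting logic stays the same.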