Let's Build a Python Web Scraping Project from Scratch | Hands-On Tutorial

Jovian
Join our upcoming 20-week data science boot camp: https://www.jovian.ai/data-analyst-bo...

💻 Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It’s a useful technique for creating datasets for research and learning.

🔗 Resources used in the workshop:
- Rough notebook: https://jovian.ai/aakashns-6l3/scrapi...
- Final notebook: https://jovian.ai/aakashns-6l3/scrapi...
- Web scraping project guide: https://jovian.ai/aakashns/python-web...
- Web scraping tutorial: https://jovian.ai/aakashns/python-web...

In this workshop, we’ll use Python and its ecosystem of libraries to scrape information from a website and create a dataset of CSV file(s).

Here are the steps we’ll follow to build a web scraping project from scratch:
✅ Pick a website and identify the information to be scraped into a CSV file
💾 Use the requests library to download web pages from the site programmatically
💬 Use Beautiful Soup to parse and extract information from web pages
📝 Create well-formatted CSV file(s) with the extracted information
✍ Document and share your work online in the form of a Jupyter notebook or blog post
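
The download, parse, and save steps above can be sketched roughly as follows. This is a minimal illustration, not the workshop's actual notebook code: the function names, the `h2.title a` selector, and the `title`/`url` columns are all assumptions that would need to be adapted to whichever site you pick.

```python
import csv

import requests
from bs4 import BeautifulSoup


def fetch_page(url):
    """Download a page and return its HTML, raising on HTTP error codes."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_items(html):
    """Extract one dict per matching link from the page HTML.

    The CSS selector is hypothetical; inspect the real page to find
    the tags and classes that hold the data you want.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for link in soup.select("h2.title a"):
        rows.append({
            "title": link.get_text(strip=True),
            "url": link.get("href", ""),
        })
    return rows


def write_csv(rows, path):
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "url"])
        writer.writeheader()
        writer.writerows(rows)
```

With a real page URL and a matching selector, the three pieces chain together as `write_csv(parse_items(fetch_page(page_url)), "items.csv")`, where `page_url` is a placeholder for the address you chose. Keeping each step in its own function makes it easy to rerun only the parsing while you adjust the selector.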

⌚ Time Breaks:
Introduction 00:00
Problem Statement 7:03
Setting up Jupyter 15:09
Fetching pages with requests 25:26
Parsing pages with Beautiful Soup 33:27
Saving data to CSV files 1:01:41
Scraping another page 1:03:10
Defining functions 1:11:28
Putting it together 1:30:35
Documentation 1:56:04
Publishing your notebook 2:26:13
Q&A 2:28:22

🎤 About the speaker
Aakash N S is the co-founder and CEO of Jovian - a community learning platform for data science & ML. Previously, Aakash worked as a software engineer (APIs & Data Platforms) at Twitter in Ireland and San Francisco. He graduated from the Indian Institute of Technology, Bombay, and is also an avid blogger, open-source contributor, and online educator.

-
Learn Data Science the right way at https://www.jovian.ai
Interact with a global community of like-minded learners https://jovian.ai/forum/
Get the latest news and updates on Machine Learning at Twitter: jovianml
Connect with us professionally on LinkedIn: jovianml
Subscribe for new videos on Artificial Intelligence: jovianml
Published 3 years ago, on 1400/01/26 (Solar Hijri calendar).
Viewed 157,794 times.