Ultimate Python Cheat Sheet: Practical Python For Everyday Tasks (part 2)
Table of Contents:
· Working With Scikit-Learn Library (Machine Learning)
· Working With Plotly Library (Interactive Data Visualization)
· Working With Dates and Times
· Working With More Advanced List Comprehensions and Lambda Functions
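· Working With Object Oriented Programming
· Working With Decorators
· Working With GraphQL
· Working With Regular Expressions
· Working With Strings
· Working With Web Scraping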
Working With Scikit-Learn Library (Machine Learning)
1. Loading a Dataset
To load a built-in dataset for your ML experiments:
from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data, iris.target
2. Splitting Data into Training and Test Sets
To divide your data, dedicating portions to training and evaluation:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
3. Training a Model
To train an ML model using RandomForestClassifier:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
model.fit(X_train, y_train)
4. Making Predictions
To access the model predictions:
predictions = model.predict(X_test)
5. Evaluating Model Performance
To evaluate your model, measuring its accuracy in prediction:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, predictions)
print(f"Model accuracy: {accuracy}")
6. Using Cross-Validation
To estimate performance more robustly with 5-fold cross-validation:
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=5)
print(f"Cross-validation scores: {scores}")
7. Feature Scaling
To standardize your features (zero mean, unit variance), fitting the scaler on the training data only:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
8. Parameter Tuning with Grid Search
To search a grid of hyperparameters for the best-performing combination:
from sklearn.model_selection import GridSearchCV
param_grid = {'n_estimators': [10, 50, 100], 'max_depth': [None, 10, 20]}
grid_search = GridSearchCV(model, param_grid, cv=5)
grid_search.fit(X_train, y_train)
9. Pipeline Creation
To chain preprocessing and modeling steps into a single estimator:
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier())
])
pipeline.fit(X_train, y_train)
10. Saving and Loading a Model
To save a trained model to disk and load it back:
import joblib
# Saving the model
joblib.dump(model, 'model.joblib')
# Loading the model
loaded_model = joblib.load('model.joblib')
Working With Plotly Library (Interactive Data Visualization)
1. Creating a Basic Line Chart
To create a line chart:
import plotly.graph_objs as go
import plotly.io as pio
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines'))
pio.show(fig)
2. Creating a Scatter Plot
To create a scatter plot:
fig = go.Figure(data=go.Scatter(x=x, y=y, mode='markers'))
pio.show(fig)
3. Creating a Bar Chart
To create a bar chart:
categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 20, 15, 30, 25]
fig = go.Figure(data=go.Bar(x=categories, y=values))
pio.show(fig)
4. Creating a Pie Chart
To create a Pie Chart:
labels = ['Earth', 'Water', 'Fire', 'Air']
sizes = [25, 35, 20, 20]
fig = go.Figure(data=go.Pie(labels=labels, values=sizes))
pio.show(fig)
5. Creating a Histogram
To create a Histogram:
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
fig = go.Figure(data=go.Histogram(x=data))
pio.show(fig)
6. Creating Box Plots
To create a Box Plot:
data = [1, 2, 2, 3, 4, 4, 4, 5, 5, 6]
fig = go.Figure(data=go.Box(y=data))
pio.show(fig)
7. Creating Heatmaps
To create a heatmap:
import numpy as np
z = np.random.rand(10, 10)  # Generate random data
fig = go.Figure(data=go.Heatmap(z=z))
pio.show(fig)
8. Creating 3D Surface Plots
To create a 3D Surface Plot:
z = np.random.rand(20, 20)  # Generate random data
fig = go.Figure(data=go.Surface(z=z))
pio.show(fig)
9. Creating Subplots
To create a subplot:
from plotly.subplots import make_subplots
fig = make_subplots(rows=1, cols=2)
fig.add_trace(go.Scatter(x=x, y=y, mode='lines'), row=1, col=1)
fig.add_trace(go.Bar(x=categories, y=values), row=1, col=2)
pio.show(fig)
10. Creating Interactive Time Series
To plot a time series:
import pandas as pd
dates = pd.date_range('20230101', periods=5)
values = [10, 11, 12, 13, 14]
fig = go.Figure(data=go.Scatter(x=dates, y=values, mode='lines+markers'))
pio.show(fig)
Working With Dates and Times
1. Getting the Current Date and Time
To get the current date and time:
from datetime import datetime
now = datetime.now()
print(f"Current date and time: {now}")
2. Creating Specific Date and Time
To create a datetime for a specific moment in the past or future:
specific_time = datetime(2023, 1, 1, 12, 30)
print(f"Specific date and time: {specific_time}")
3. Formatting Dates and Times
To format a datetime as a string:
formatted = now.strftime("%Y-%m-%d %H:%M:%S")
print(f"Formatted date and time: {formatted}")
4. Parsing Dates and Times from Strings
To parse a datetime from a string:
date_string = "2023-01-01 15:00:00"
parsed_date = datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
print(f"Parsed date and time: {parsed_date}")
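Parsing raises a ValueError when the string does not match the format. A minimal sketch of a tolerant parser is shown here; the helper name `parse_or_none` is hypothetical and is not part of the standard library.

```python
from datetime import datetime

def parse_or_none(date_string, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a datetime, or None if date_string does not match fmt."""
    try:
        return datetime.strptime(date_string, fmt)
    except ValueError:
        return None

good = parse_or_none("2023-01-01 15:00:00")
bad = parse_or_none("01/01/2023")  # Does not match the default format
print(good)
print(bad)
```

Returning None keeps the calling code simple when input comes from users or scraped pages; re-raising with a clearer message is an equally reasonable choice.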
5. Working with Time Deltas
To move forward or backward in time by a fixed interval:
from datetime import timedelta
delta = timedelta(days=7)
future_date = now + delta
print(f"Date after 7 days: {future_date}")
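Subtracting one datetime from another also yields a timedelta. The sketch below uses two made-up dates to show how the interval can be inspected.

```python
from datetime import datetime, timedelta

start = datetime(2023, 1, 1, 12, 30)
end = datetime(2023, 1, 8, 18, 0)

elapsed = end - start            # Subtracting datetimes gives a timedelta
print(elapsed.days)              # Whole days in the interval
print(elapsed.total_seconds())   # Full interval expressed in seconds
print(elapsed > timedelta(days=7))  # Timedeltas can be compared directly
```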
6. Comparing Dates and Times
To compare dates and times:
if specific_time > now:
    print("Specific time is in the future.")
else:
    print("Specific time has passed.")
7. Extracting Components from a Date/Time
To extract the year, month, day, and other components of a datetime:
year = now.year
month = now.month
day = now.day
hour = now.hour
minute = now.minute
second = now.second
print(f"Year: {year}, Month: {month}, Day: {day}, Hour: {hour}, Minute: {minute}, Second: {second}")
8. Working with Time Zones
To work with time zones using timezone-aware datetimes:
from datetime import datetime, timezone, timedelta
utc_time = datetime.now(timezone.utc)
print(f"Current UTC time: {utc_time}")
# Converting to a fixed-offset timezone (e.g., EST, UTC-5; ignores daylight saving time)
est = timezone(timedelta(hours=-5), "EST")
est_time = utc_time.astimezone(est)
print(f"Current EST time: {est_time}")
9. Getting the Weekday
To identify the day of the week:
weekday = now.strftime("%A")
print(f"Today is: {weekday}")
10. Working with Unix Timestamps
To convert between datetimes and Unix timestamps (seconds since 1970-01-01 UTC):
timestamp = now.timestamp()
print(f"Current timestamp: {timestamp}")
# Converting a timestamp back to a datetime
date_from_timestamp = datetime.fromtimestamp(timestamp)
print(f"Date from timestamp: {date_from_timestamp}")
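`datetime.fromtimestamp` returns local time unless a time zone is passed. A short sketch of a UTC round trip, using an arbitrary example moment, could look like this:

```python
from datetime import datetime, timezone

# The Unix epoch itself, as an aware UTC datetime
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(epoch)

# Round trip: aware datetime -> timestamp -> aware datetime
moment = datetime(2023, 1, 1, 12, 30, tzinfo=timezone.utc)
ts = moment.timestamp()
restored = datetime.fromtimestamp(ts, tz=timezone.utc)
print(ts, restored)
```

Keeping timestamps paired with an explicit UTC time zone avoids results that depend on the machine's local settings.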
Working With More Advanced List Comprehensions and Lambda Functions
1. Nested List Comprehensions
To build a nested list with nested list comprehensions:
matrix = [[j for j in range(5)] for i in range(3)]
print(matrix)  # Creates a 3x5 matrix
2. Conditional List Comprehensions
To filter elements that meet your criteria:
filtered = [x for x in range(10) if x % 2 == 0]
print(filtered)  # Even numbers from 0 to 9
3. List Comprehensions with Multiple Iterables
To combine elements from multiple iterables in a single comprehension:
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
print(pairs)  # Pairs of non-equal elements
4. Using Lambda Functions
To define short anonymous functions for one-off use:
square = lambda x: x**2
print(square(5))  # Prints 25
5. Lambda Functions in List Comprehensions
To employ lambda functions within your list comprehensions:
squared = [(lambda x: x**2)(x) for x in range(5)]
print(squared)  # Squares of numbers from 0 to 4
6. List Comprehensions for Flattening Lists
To flatten a nested list, spreading its elements into a single dimension:
nested = [[1, 2, 3], [4, 5], [6, 7]]
flattened = [x for sublist in nested for x in sublist]
print(flattened)
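As an alternative to the double loop, the standard library's `itertools.chain.from_iterable` flattens one level of nesting. This sketch reuses the same sample data:

```python
from itertools import chain

nested = [[1, 2, 3], [4, 5], [6, 7]]

# chain.from_iterable yields the items of each sublist in turn
flattened = list(chain.from_iterable(nested))
print(flattened)
```

Both approaches flatten exactly one level; deeper nesting would need recursion or repeated flattening.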
7. Applying Functions to Elements
To apply a transformation function to each element:
import math
transformed = [math.sqrt(x) for x in range(1, 6)]
print(transformed)  # Square roots of numbers from 1 to 5
8. Using Lambda with Map and Filter
To map and filter lists:
mapped = list(map(lambda x: x**2, range(5)))
filtered = list(filter(lambda x: x > 5, mapped))
print(mapped)  # Squares of numbers from 0 to 4
print(filtered)  # Elements greater than 5
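The same map-and-filter steps can be written as list comprehensions, which many Python programmers find easier to read. A small sketch for comparison:

```python
# Equivalent of map(lambda x: x**2, range(5))
mapped = [x**2 for x in range(5)]

# Equivalent of filter(lambda x: x > 5, mapped)
filtered = [value for value in mapped if value > 5]

# Both steps fused into one comprehension
combined = [x**2 for x in range(5) if x**2 > 5]

print(mapped)
print(filtered)
print(combined)
```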
9. List Comprehensions with Conditional Expressions
To choose between two expressions for each element with a conditional expression:
conditional = [x if x > 2 else x**2 for x in range(5)]
print(conditional)  # Squares numbers less than or equal to 2, passes others unchanged
10. Complex Transformations with Lambda
To apply condition-dependent transformations using lambda functions:
complex_transformation = list(map(lambda x: x**2 if x % 2 == 0 else x + 5, range(5)))
print(complex_transformation)  # Applies different transformations based on even-odd condition
Working With Object Oriented Programming
1. Defining a Class
Creating a class:
class Wizard:
    def __init__(self, name, power):
        self.name = name
        self.power = power
    def cast_spell(self):
        print(f"{self.name} casts a spell with power {self.power}!")
2. Creating an Instance
To create an instance of your class:
merlin = Wizard("Merlin", 100)
3. Invoking Methods
To call a method on an instance of a class:
merlin.cast_spell()
4. Inheritance
To create a subclass that extends a base class:
class ArchWizard(Wizard):
    def __init__(self, name, power, realm):
        super().__init__(name, power)
        self.realm = realm
    def summon_familiar(self):
        print(f"{self.name} summons a familiar from the {self.realm} realm.")
5. Overriding Methods
To override a method inherited from a base class:
class Sorcerer(Wizard):
    def cast_spell(self):
        print(f"{self.name} casts a powerful dark spell!")
6. Polymorphism
To interact with different forms through a common interface:
def unleash_magic(wizard):
    wizard.cast_spell()
unleash_magic(merlin)
unleash_magic(Sorcerer("Voldemort", 90))
7. Encapsulation
To hide an attribute from outside code using a double-underscore (name-mangled) name:
class Alchemist:
    def __init__(self, secret_ingredient):
        self.__secret = secret_ingredient
    def reveal_secret(self):
        print(f"The secret ingredient is {self.__secret}")
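Double-underscore attributes are not truly private: Python renames them to `_ClassName__attribute` (name mangling). The sketch below illustrates this with a variant of the class above whose `reveal_secret` returns the string instead of printing it, and an arbitrary sample ingredient.

```python
class Alchemist:
    def __init__(self, secret_ingredient):
        self.__secret = secret_ingredient  # Stored as _Alchemist__secret

    def reveal_secret(self):
        return f"The secret ingredient is {self.__secret}"

alchemist = Alchemist("mercury")

# Outside the class the unmangled name does not exist
try:
    alchemist.__secret
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# The mangled name is still reachable, so this is a convention, not security
mangled_value = alchemist._Alchemist__secret
print(direct_access_failed, mangled_value)
```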
8. Composition
To assemble objects from simpler ones:
class Spellbook:
    def __init__(self, spells):
        self.spells = spells
class Mage:
    def __init__(self, name, spellbook):
        self.name = name
        self.spellbook = spellbook
9. Class Methods and Static Methods
To define methods that belong to the class itself (class methods) or need neither the class nor an instance (static methods):
class Enchanter:
    @staticmethod
    def enchant(item):
        print(f"{item} is enchanted!")
    @classmethod
    def summon(cls):
        print("A new enchanter is summoned.")
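A common use of class methods is as alternative constructors that also touch class-level state. The following sketch extends the idea; the `name` attribute and the `summoned` counter are assumptions added for illustration.

```python
class Enchanter:
    summoned = 0  # Class-level counter shared by all instances

    def __init__(self, name):
        self.name = name

    @staticmethod
    def enchant(item):
        # Needs neither the class nor an instance
        return f"{item} is enchanted!"

    @classmethod
    def summon(cls, name):
        # cls is the class itself, so subclasses get their own type back
        cls.summoned += 1
        return cls(name)

first = Enchanter.summon("Morgana")
second = Enchanter.summon("Circe")
print(Enchanter.summoned, first.name, Enchanter.enchant("Sword"))
```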
10. Properties and Setters
To control how an attribute is read and validated when it is set:
class Elementalist:
    def __init__(self, element):
        self._element = element
    @property
    def element(self):
        return self._element
    @element.setter
    def element(self, value):
        if value in ["Fire", "Water", "Earth", "Air"]:
            self._element = value
        else:
            print("Invalid element!")
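Here is a usage sketch of the property pattern. It is a variant, not the class above: it raises ValueError instead of printing, and it routes the constructor through the setter so the initial value is validated too. The `VALID` tuple is a name introduced for illustration.

```python
class Elementalist:
    VALID = ("Fire", "Water", "Earth", "Air")

    def __init__(self, element):
        self.element = element  # Goes through the setter below

    @property
    def element(self):
        return self._element

    @element.setter
    def element(self, value):
        if value not in self.VALID:
            raise ValueError(f"Invalid element: {value}")
        self._element = value

mage = Elementalist("Fire")
mage.element = "Water"        # Accepted by the setter

try:
    mage.element = "Plasma"   # Rejected; the old value is kept
except ValueError as error:
    message = str(error)

print(mage.element, message)
```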
Working With Decorators
1. Basic Decorator
To create a simple decorator that wraps a function:
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper
@my_decorator
def say_hello():
    print("Hello!")
say_hello()
2. Decorator with Arguments
To pass arguments to the function within a decorator:
def my_decorator(func):
    def wrapper(*args, **kwargs):
        print("Before call")
        result = func(*args, **kwargs)
        print("After call")
        return result
    return wrapper
@my_decorator
def greet(name):
    print(f"Hello {name}")
greet("Alice")
3. Using functools.wraps
To preserve the metadata of the original function when decorating:
from functools import wraps
def my_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        """Wrapper function"""
        return func(*args, **kwargs)
    return wrapper
@my_decorator
def greet(name):
    """Greet someone"""
    print(f"Hello {name}")
print(greet.__name__)  # Outputs: 'greet'
print(greet.__doc__)  # Outputs: 'Greet someone'
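To see what `wraps` changes, the sketch below decorates two otherwise identical functions, one with and one without it. The decorator and function names are made up for the comparison.

```python
from functools import wraps

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @wraps(func)  # Copies __name__, __doc__ and more onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def greet_plain(name):
    """Greet someone"""
    return f"Hello {name}"

@with_wraps
def greet_wrapped(name):
    """Greet someone"""
    return f"Hello {name}"

print(greet_plain.__name__, greet_plain.__doc__)      # Metadata of the wrapper
print(greet_wrapped.__name__, greet_wrapped.__doc__)  # Metadata of the original
```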
4. Class Decorator
To create a decorator using a class:
class MyDecorator:
    def __init__(self, func):
        self.func = func
    def __call__(self, *args, **kwargs):
        print("Before call")
        result = self.func(*args, **kwargs)
        print("After call")
        return result  # Pass the wrapped function's return value through
@MyDecorator
def greet(name):
    print(f"Hello {name}")
greet("Alice")
5. Decorator with Its Own Arguments
To create a decorator that accepts its own arguments:
from functools import wraps
def repeat(times):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result  # Return value of the last call
        return wrapper
    return decorator
@repeat(3)
def say_hello():
    print("Hello")
say_hello()
6. Method Decorator
To apply a decorator to a method within a class:
def method_decorator(func):
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        print("Method Decorator")
        return func(self, *args, **kwargs)
    return wrapper
class MyClass:
    @method_decorator
    def greet(self, name):
        print(f"Hello {name}")
obj = MyClass()
obj.greet("Alice")
7. Stacking Decorators
To apply multiple decorators to a single function:
@my_decorator
@repeat(2)
def greet(name):
    print(f"Hello {name}")
greet("Alice")
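Stacked decorators are applied bottom-up, so the one written closest to the function wraps it first and runs innermost. The sketch below records the call order in a list; the `tag` decorator factory is a hypothetical helper written for this illustration.

```python
from functools import wraps

calls = []

def tag(label):
    """Decorator factory that logs when its wrapper is entered and exited."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            calls.append(f"enter {label}")
            result = func(*args, **kwargs)
            calls.append(f"exit {label}")
            return result
        return wrapper
    return decorator

@tag("outer")
@tag("inner")
def greet(name):
    calls.append(f"Hello {name}")
    return name

greet("Alice")
print(calls)
```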
8. Decorator with Optional Arguments
To create a decorator that works with or without arguments:
def smart_decorator(arg=None):
    # Used bare (@smart_decorator), arg is the decorated function, not a message
    message = None if callable(arg) else arg
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if message:
                print(f"Argument: {message}")
            return func(*args, **kwargs)
        return wrapper
    if callable(arg):
        return decorator(arg)
    return decorator
@smart_decorator
def no_args():
    print("No args")
@smart_decorator("With args")
def with_args():
    print("With args")
no_args()
with_args()
9. Class Method Decorator
To decorate a class method:
class MyClass:
    @classmethod
    @my_decorator
    def class_method(cls):
        print("Class method called")
MyClass.class_method()
10. Decorator for Static Method
To decorate a static method:
class MyClass:
    @staticmethod
    @my_decorator
    def static_method():
        print("Static method called")
MyClass.static_method()
Working With GraphQL
1. Setting Up a GraphQL Client
To set up a GraphQL client with the gql library:
from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport
transport = RequestsHTTPTransport(url='https://your-graphql-endpoint.com/graphql')
client = Client(transport=transport, fetch_schema_from_transport=True)
2. Executing a Simple Query
Executing a Query:
query = gql('''
{
  allWizards {
    id
    name
    power
  }
}
''')
result = client.execute(query)
print(result)
3. Executing a Query with Variables
Query with Variables:
query = gql('''
query GetWizards($element: String!) {
  wizards(element: $element) {
    id
    name
  }
}
''')
params = {"element": "Fire"}
result = client.execute(query, variable_values=params)
print(result)
4. Mutations
To create and execute a mutation:
mutation = gql('''
mutation CreateWizard($name: String!, $element: String!) {
  createWizard(name: $name, element: $element) {
    wizard {
      id
      name
    }
  }
}
''')
params = {"name": "Gandalf", "element": "Light"}
result = client.execute(mutation, variable_values=params)
print(result)
5. Handling Errors
To handle errors returned for a query:
from gql import gql, Client
from gql.transport.exceptions import TransportQueryError
try:
    result = client.execute(query)
except TransportQueryError as e:
    print(f"GraphQL Query Error: {e}")
6. Subscriptions
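To work with subscriptions (the client needs a transport that supports them, such as a WebSocket transport, rather than the HTTP transport used above):
subscription = gql('''
subscription {
  wizardUpdated {
    id
    name
    power
  }
}
''')
for result in client.subscribe(subscription):
    print(result)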
7. Fragments
Working with Fragments:
query = gql('''
fragment WizardDetails on Wizard {
  name
  power
}
query {
  allWizards {
    ...WizardDetails
  }
}
''')
result = client.execute(query)
print(result)
8. Inline Fragments
To tailor the response based on the type of the object returned:
query = gql('''
{
  search(text: "magic") {
    __typename
    ... on Wizard {
      name
      power
    }
    ... on Spell {
      name
      effect
    }
  }
}
''')
result = client.execute(query)
print(result)
9. Using Directives
To dynamically include or skip fields in your queries based on conditions:
query = gql('''
query GetWizards($withPower: Boolean!) {
  allWizards {
    name
    power @include(if: $withPower)
  }
}
''')
params = {"withPower": True}
result = client.execute(query, variable_values=params)
print(result)
10. Batching Requests
To combine multiple lookups into a single request, reducing network overhead:
from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport
transport = RequestsHTTPTransport(url='https://your-graphql-endpoint.com/graphql', use_json=True)
client = Client(transport=transport, fetch_schema_from_transport=True)
# client.execute() takes a single document, so both lookups go into one query
query = gql('''
query {
  wizard(id: "1") { name }
  allSpells { name }
}
''')
result = client.execute(query)
print(result)
Working With Regular Expressions
1. Basic Pattern Matching
To find a match for a pattern within a string:
import re
text = "Search this string for patterns."
match = re.search(r"patterns", text)
if match:
    print("Pattern found!")
2. Compiling Regular Expressions
To compile a regular expression for repeated use:
pattern = re.compile(r"patterns")
match = pattern.search(text)
3. Matching at the Beginning or End
To check if a string starts or ends with a pattern:
if re.match(r"^Search", text):
    print("Starts with 'Search'")
if re.search(r"patterns\.$", text):  # \. matches a literal period
    print("Ends with 'patterns.'")
4. Finding All Matches
To find all occurrences of a pattern in a string:
all_matches = re.findall(r"\bt\w+", text)  # Finds words starting with 't'
print(all_matches)
5. Search and Replace (Substitution)
To replace occurrences of a pattern within a string:
replaced_text = re.sub(r"string", "sentence", text)
print(replaced_text)
6. Splitting a String
To split a string by occurrences of a pattern:
words = re.split(r"\s+", text)  # Split on one or more whitespace characters
print(words)
7. Escaping Special Characters
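To match special characters literally, escape them with a backslash or with re.escape:
escaped = re.search(r"patterns\.", text)  # \. matches a literal period, not any character
escaped_auto = re.search(re.escape("patterns."), text)  # re.escape adds the backslashes for you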
8. Grouping and Capturing
To group parts of a pattern and extract their values:
match = re.search(r"(\w+) (\w+)", text)
if match:
    print(match.group())  # The whole match
    print(match.group(1))  # The first group
9. Non-Capturing Groups
To define groups without capturing them:
match = re.search(r"(?:\w+) (\w+)", text)
if match:
    print(match.group(1))  # The first (and only) group
10. Lookahead and Lookbehind Assertions
To match a pattern based on what comes before or after it without including it in the result:
lookahead = re.search(r"\b\w+(?= string)", text) # Word before ' string'
lookbehind = re.search(r"(?<=Search )\w+", text) # Word after 'Search '
if lookahead:
    print(lookahead.group())
if lookbehind:
    print(lookbehind.group())
11. Flags to Modify Pattern Matching Behavior
To use flags like re.IGNORECASE to change how patterns are matched:
case_insensitive = re.findall(r"search", text, re.IGNORECASE)
print(case_insensitive)
12. Using Named Groups
To assign names to groups and reference them by name:
match = re.search(r"(?P<first>\w+) (?P<second>\w+)", text)
if match:
    print(match.group('first'))
    print(match.group('second'))
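Named groups are especially handy with `groupdict()`, which returns all of them as a dictionary. The sketch below parses a made-up log line; the line format and group names are assumptions for illustration.

```python
import re

log_line = "2023-01-01 ERROR Disk full"
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.+)"
)

match = pattern.search(log_line)
# groupdict() maps each group name to the text it captured
fields = match.groupdict() if match else {}
print(fields)
```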
13. Matching Across Multiple Lines
To match patterns over multiple lines using the re.MULTILINE flag:
multi_line_text = "Start\nmiddle end"
matches = re.findall(r"^m\w+", multi_line_text, re.MULTILINE)
print(matches)
14. Lazy Quantifiers
To match as few characters as possible using lazy quantifiers (*?, +?, ??):
html = "<body><h1>Title</h1></body>"
match = re.search(r"<.*?>", html)
if match:
    print(match.group())  # Matches '<body>'
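The difference is easiest to see by running the greedy and lazy versions of the same pattern side by side, as in this short sketch:

```python
import re

html = "<body><h1>Title</h1></body>"

greedy = re.search(r"<.*>", html).group()   # .* grabs as much as possible
lazy = re.search(r"<.*?>", html).group()    # .*? stops at the first '>'
all_tags = re.findall(r"<.*?>", html)       # Each tag matched separately

print(greedy)
print(lazy)
print(all_tags)
```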
15. Verbose Regular Expressions
To use re.VERBOSE for more readable regular expressions:
pattern = re.compile(r"""
\b # Word boundary
\w+ # One or more word characters
\s # Space
""", re.VERBOSE)
match = pattern.search(text)
Working With Strings
1. Concatenating Strings
To join strings together:
greeting = "Hello"
name = "Alice"
message = greeting + ", " + name + "!"
print(message)
2. String Formatting with str.format
To insert values into a string template:
message = "{}, {}. Welcome!".format(greeting, name)
print(message)
3. Formatted String Literals (f-strings)
To embed expressions inside string literals (Python 3.6+):
message = f"{greeting}, {name}. Welcome!"
print(message)
4. String Methods — Case Conversion
To change the case of a string:
s = "Python"
print(s.upper())  # Uppercase
print(s.lower())  # Lowercase
print(s.title())  # Title Case
5. String Methods — strip, rstrip, lstrip
To remove whitespace or specific characters from the ends of a string:
s = "  trim me  "
print(s.strip())  # Both ends
print(s.rstrip())  # Right end
print(s.lstrip())  # Left end
6. String Methods — startswith, endswith
To check the start or end of a string for specific text:
s = "filename.txt"
print(s.startswith("file")) # True
print(s.endswith(".txt")) # True
7. String Methods — split, join
To split a string into a list or join a list into a string:
s = "split,this,string"
words = s.split(",") # Split string into list
joined = " ".join(words) # Join list into string
print(words)
print(joined)
8. String Methods — replace
To replace parts of a string with another string:
s = "Hello world"
new_s = s.replace("world", "Python")
print(new_s)
9. String Methods — find, index
To find the position of a substring within a string:
s = "look for a substring"
position = s.find("substring") # Returns -1 if not found
index = s.index("substring") # Raises ValueError if not found
print(position)
print(index)
10. String Methods — Working with Characters
To process individual characters in a string:
s = "characters"
for char in s:
    print(char)  # Prints each character on a new line
11. String Methods — isdigit, isalpha, isalnum
To check if a string contains only digits, alphabetic characters, or alphanumeric characters:
print("123".isdigit()) # True
print("abc".isalpha()) # True
print("abc123".isalnum())  # True
12. String Slicing
To extract a substring using slicing:
s = "slice me"
sub = s[2:7]  # From 3rd to 7th character
print(sub)
13. String Length with len
To get the length of a string:
s = "length"
print(len(s))  # 6
14. Multiline Strings
To work with strings spanning multiple lines:
multi = """Line one
Line two
Line three"""
print(multi)
15. Raw Strings
To treat backslashes as literal characters, useful for regex patterns and file paths:
path = r"C:\User\name\folder"
print(path)
Working With Web Scraping
1. Fetching Web Pages with requests
To retrieve the content of a web page:
import requests
url = 'https://example.com'
response = requests.get(url)
html = response.text
2. Parsing HTML with BeautifulSoup
To parse HTML and extract data:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
print(soup.prettify())  # Pretty-print the HTML
3. Navigating the HTML Tree
To find elements using tags:
title = soup.title.text # Get the page title
headings = soup.find_all('h1') # List of all <h1> tags
4. Using CSS Selectors
To select elements using CSS selectors:
articles = soup.select('div.article')  # All <div> elements with class 'article'
5. Extracting Data from Tags
To extract text and attributes from HTML elements:
for article in articles:
    title = article.h2.text  # Text inside the <h2> tag
    link = article.a['href']  # 'href' attribute of the <a> tag
    print(title, link)
6. Handling Relative URLs
To convert relative URLs to absolute URLs:
from urllib.parse import urljoin
relative_urls = [a['href'] for a in soup.find_all('a', href=True)]  # Links as written in the page
absolute_urls = [urljoin(url, link) for link in relative_urls]
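How `urljoin` resolves a link depends on the form of the relative URL. The sketch below walks through the common cases against a made-up base URL.

```python
from urllib.parse import urljoin

base = "https://example.com/blog/post"

sibling = urljoin(base, "page2")        # Replaces the last path segment
rooted = urljoin(base, "/contact")      # Leading slash: relative to the site root
parent = urljoin(base, "../about")      # '..' climbs one directory
absolute = urljoin(base, "https://other.org/x")  # Absolute URLs pass through

for result in (sibling, rooted, parent, absolute):
    print(result)
```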
7. Dealing with Pagination
To scrape content across multiple pages:
base_url = "https://example.com/page/"
for page in range(1, 6): # For 5 pages
    page_url = base_url + str(page)
    response = requests.get(page_url)
    # Process each page's content
8. Handling AJAX Requests
To scrape data loaded by AJAX requests:
# Find the URL of the AJAX request (using browser's developer tools) and fetch it
ajax_url = 'https://example.com/ajax_endpoint'
data = requests.get(ajax_url).json()  # Assuming the response is JSON
9. Using Regular Expressions in Web Scraping
To extract data using regular expressions:
import re
emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', html)
10. Respecting robots.txt
To check robots.txt for scraping permissions:
from urllib.robotparser import RobotFileParser
rp = RobotFileParser()
rp.set_url('https://example.com/robots.txt')
rp.read()
can_scrape = rp.can_fetch('*', url)
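The same parser can be tried offline by feeding it the lines of a robots.txt file through `parse()`. The rules and URLs below are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# A tiny robots.txt, supplied as a list of lines instead of fetched over HTTP
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(robots_lines)

allowed_public = rp.can_fetch("*", "https://example.com/articles/1")
allowed_private = rp.can_fetch("*", "https://example.com/private/data")
print(allowed_public, allowed_private)
```

Checking `can_fetch` before each request is a simple way to keep a scraper within the rules a site publishes.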
11. Using Sessions and Cookies
To maintain sessions and handle cookies:
session = requests.Session()
session.get('https://example.com/login')
session.cookies.set('key', 'value') # Set cookies, if needed
response = session.get('https://example.com/protected_page')
12. Scraping with Browser Automation (selenium Library)
To scrape dynamic content rendered by JavaScript:
from selenium import webdriver
browser = webdriver.Chrome()
browser.get('https://example.com')
content = browser.page_source
# Parse and extract data using BeautifulSoup, etc.
browser.quit()
13. Error Handling in Web Scraping
To handle errors and exceptions:
try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()  # Raises an error for bad status codes
except requests.exceptions.RequestException as e:
    print(f"Error: {e}")
14. Asynchronous Web Scraping
To scrape websites asynchronously for faster data retrieval:
import aiohttp
import asyncio
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()
async def main(urls):
    # Share one session across all requests and run them concurrently
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, url) for url in urls))
urls = ['https://example.com/page1', 'https://example.com/page2']
pages = asyncio.run(main(urls))
15. Data Storage (CSV, Database)
To store scraped data in a CSV file:
import csv
with open('output.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Title', 'URL'])
    for article in articles:
        # articles holds the tags selected earlier; pull out the title text and link
        writer.writerow([article.h2.text, article.a['href']])
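When the scraped records are already dictionaries, `csv.DictWriter` and `csv.DictReader` keep column names attached to the values. The sketch below writes to an in-memory buffer rather than a file so it can run anywhere; the sample rows are invented.

```python
import csv
import io

rows = [
    {"Title": "First post", "URL": "https://example.com/1"},
    {"Title": "Second, with comma", "URL": "https://example.com/2"},
]

# Write the rows; values containing commas are quoted automatically
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Title", "URL"])
writer.writeheader()
writer.writerows(rows)

# Read them back as dictionaries keyed by the header row
buffer.seek(0)
restored = list(csv.DictReader(buffer))
print(restored)
```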
Original article (accessible only via VPN)