cisticola.scraper.bitchute module

class cisticola.scraper.bitchute.BitchuteScraper

An implementation of a Scraper for BitChute, using classes from the 4CAT library.

can_handle(channel)

Whether or not the scraper can scrape the specified channel.

Parameters: channel (Channel) – Channel to be scraped.
Returns: True if the scraper is capable of scraping channel, False if not.
Return type: bool

get_posts(channel: Channel, since: ScraperResult | None = None) → Generator[ScraperResult, None, None]

Scrape all posts from the specified Channel.

Parameters:

channel (Channel) – Channel to be scraped.
since (ScraperResult or None) – Most recently scraped ScraperResult from a previous scrape, or None if scraper has not run before.

Yields:

ScraperResult – Scraper result from a single post/comment from the specified Channel.
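The role of the since argument can be illustrated with a minimal sketch. It is not the cisticola implementation: ResultStub is a hypothetical stand-in for ScraperResult, its date field is an assumption, and the input is assumed to arrive newest first.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Generator, Iterable, Optional


@dataclass
class ResultStub:
    """Hypothetical stand-in for ScraperResult (field names are assumptions)."""
    post_id: str
    date: datetime


def posts_since_sketch(
    newest_first: Iterable[ResultStub],
    since: Optional[ResultStub] = None,
) -> Generator[ResultStub, None, None]:
    """Yield results until one is no newer than the previous scrape."""
    for result in newest_first:
        # Stop as soon as we reach something already covered by `since`.
        if since is not None and result.date <= since.date:
            break
        yield result
```

With since=None every result is yielded, which corresponds to a scraper that has not run before.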

get_username_from_url(url)

Extract a channel’s username from its URL.

Parameters: url (str) – URL of the channel on a given platform, e.g. "https://twitter.com/EliotHiggins"
Returns: username – Extracted username of the channel, e.g. "EliotHiggins"
Return type: str
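One way to perform this kind of extraction is to take the last non-empty path segment of the URL. The sketch below is only an illustration: username_from_url_sketch is a hypothetical name, and the assumption that the username is always the final path segment may not hold for every platform.

```python
from urllib.parse import urlparse


def username_from_url_sketch(url: str) -> str:
    """Return the last non-empty path segment of `url` (assumed to be the username)."""
    path = urlparse(url).path
    # Drop empty segments so a trailing slash does not matter.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no username found in URL: {url!r}")
    return segments[-1]


print(username_from_url_sketch("https://twitter.com/EliotHiggins"))
```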

cisticola.scraper.bitchute.append_details(video, detail)

Append extra metadata to video data.

Fetches the BitChute video detail page to scrape extra data for the given video.

Parameters:

Returns:

Tuple; the first item is the updated video data, the second is a list of comments.

cisticola.scraper.bitchute.decode_cfemail(cfemail)

Decode a protected (obfuscated) email address. See https://stackoverflow.com/questions/36911296/scraping-of-protected-email
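The commonly described decoding scheme for this kind of protected email treats the value as a hex string whose first byte is an XOR key for the remaining bytes. The sketch below assumes that scheme. Both function names are hypothetical, and the encoder exists only so the example can be checked by a round trip.

```python
def decode_cfemail_sketch(cfemail: str) -> str:
    """Decode a hex string whose first byte is the XOR key for the rest."""
    data = bytes.fromhex(cfemail)
    key = data[0]
    return bytes(byte ^ key for byte in data[1:]).decode("utf-8")


def encode_cfemail_sketch(email: str, key: int = 0x42) -> str:
    """Inverse of the decoder above; included only for round-trip testing."""
    raw = email.encode("utf-8")
    return bytes([key] + [byte ^ key for byte in raw]).hex()


encoded = encode_cfemail_sketch("user@example.com")
print(decode_cfemail_sketch(encoded))
```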

cisticola.scraper.bitchute.get_videos_user(session, user, csrftoken, detail)

Scrape videos for given BitChute user

Parameters:

Returns:

Video data dictionaries, as a generator

cisticola.scraper.bitchute.request_from_bitchute(session, method, url, headers=None, data=None)

Send a request to BitChute, via its API or a regular (non-API) page.

This centralises error checking and retrying on failure, so that callers do not have to repeat the same logic everywhere.

Parameters:

Returns:

Requests response
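The retry-on-failure idea can be sketched as follows. This is a minimal illustration, not the module's code: the function name, the retries and wait parameters, and the rule of retrying on 5xx responses are all assumptions. The session only needs a request(method, url, headers=..., data=...) method, as requests.Session has.

```python
import time


def request_with_retries_sketch(session, method, url, headers=None,
                                data=None, retries=3, wait=1.0):
    """Call session.request, retrying on connection errors and 5xx responses."""
    last_error = None
    for _ in range(retries):
        try:
            response = session.request(method, url, headers=headers, data=data)
        except OSError as exc:
            # Assumption: network failures surface as OSError subclasses
            # (for example the built-in ConnectionError).
            last_error = exc
        else:
            if response.status_code < 500:
                return response
            last_error = RuntimeError(f"server returned {response.status_code}")
        time.sleep(wait)
    raise RuntimeError(f"giving up on {url} after {retries} attempts") from last_error
```

Keeping the retry loop in one helper means callers only have to handle the final failure.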

cisticola.scraper.bitchute.strip_tags(html, convert_newlines=True)

Strip HTML from a string

Parameters:

Returns:

The input string with HTML tags removed
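A tag stripper along these lines can be sketched with the standard library's html.parser. This is an illustration only: strip_tags_sketch is a hypothetical name, and the interpretation of convert_newlines (turning br tags and closing p tags into newlines) is an assumption, not documented behaviour.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, optionally turning <br> and </p> into newlines."""

    def __init__(self, convert_newlines: bool):
        super().__init__(convert_charrefs=True)
        self.convert_newlines = convert_newlines
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.convert_newlines and tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if self.convert_newlines and tag == "p":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_sketch(html: str, convert_newlines: bool = True) -> str:
    """Return the text of `html` with all tags removed."""
    parser = _TextExtractor(convert_newlines)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()
```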