cisticola.scraper.bitchute module

class cisticola.scraper.bitchute.BitchuteScraper

Bases: Scraper

A Scraper implementation for BitChute, using classes from the 4CAT library.

can_handle(channel)

Whether or not the scraper can scrape the specified channel.

Parameters:

channel (Channel) – Channel to be scraped.

Returns:

True if the scraper is capable of scraping channel, False if not.

Return type:

bool

get_posts(channel: Channel, since: ScraperResult | None = None) → Generator[ScraperResult, None, None]

Scrape all posts from the specified Channel.

Parameters:
  • channel (Channel) – Channel to be scraped.

  • since (ScraperResult or None) – Most recently scraped ScraperResult from a previous scrape, or None if scraper has not run before.

Yields:

ScraperResult – Scraper result from a single post/comment from the specified Channel.
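
One plausible reading of the since parameter is that scraping stops once it reaches a post that is no older than the previously scraped result. The sketch below illustrates that pattern only. Result is a hypothetical stand-in for ScraperResult, the date field and the newest-first ordering are assumptions, and new_results is not part of cisticola.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass
class Result:
    """Hypothetical stand-in for ScraperResult (field names are assumed)."""
    platform_id: str
    date: datetime


def new_results(newest_first: Iterable[Result],
                since: Optional[Result] = None) -> Iterator[Result]:
    """Yield results until one is not newer than `since`.

    Assumes the input is ordered newest first, so the first already-seen
    result marks the end of the new material.
    """
    for result in newest_first:
        if since is not None and result.date <= since.date:
            break
        yield result
```

Passing since=None yields every result, which matches the documented behaviour for a scraper that has not run before.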

get_profile(channel: Channel) → RawChannelInfo

get_username_from_url(url)

Extract a channel’s username from its URL.

Parameters:

url (str) – URL of the channel on a given platform, e.g. "https://twitter.com/EliotHiggins"

Returns:

username – Extracted username of the channel, e.g. "EliotHiggins"

Return type:

str
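
As an illustration of the extraction described above, here is a minimal sketch that takes the last non-empty path segment of the URL. The function name username_from_url is hypothetical, and treating the last path segment as the username is an assumption. The scraper's actual parsing rules for BitChute URLs may differ.

```python
from urllib.parse import urlparse


def username_from_url(url: str) -> str:
    """Return the last non-empty path segment of a channel URL.

    Assumption: the username is the final path component, as in the
    "https://twitter.com/EliotHiggins" example from the docstring.
    """
    path = urlparse(url).path
    segments = [part for part in path.split("/") if part]
    if not segments:
        raise ValueError(f"no username found in URL: {url!r}")
    return segments[-1]


print(username_from_url("https://twitter.com/EliotHiggins"))
```

Trailing slashes are ignored because empty path segments are filtered out before the last one is taken.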

cisticola.scraper.bitchute.append_details(video, detail)

Append extra metadata to video data

Fetches the BitChute video detail page to scrape extra data for the given video.

Parameters:
  • video (dict) – Video details as scraped so far

  • detail (str) – Detail level. If ‘comments’, also scrape video comments.

Returns:

Tuple whose first item is the updated video data (dict) and whose second item is the list of comments.

cisticola.scraper.bitchute.decode_cfemail(cfemail)

Decode a Cloudflare-protected email address. See https://stackoverflow.com/questions/36911296/scraping-of-protected-email
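
Assuming this helper decodes Cloudflare's email obfuscation, as its name and the linked question suggest, the commonly described scheme is a hex string whose first byte is an XOR key for the remaining bytes. The sketch below shows that scheme. The names decode_cf_email and encode_cf_email are hypothetical, and the encoder exists only to produce test input.

```python
def decode_cf_email(encoded: str) -> str:
    """Decode a hex string whose first byte is the XOR key for the rest."""
    data = bytes.fromhex(encoded)
    key = data[0]
    return "".join(chr(byte ^ key) for byte in data[1:])


def encode_cf_email(address: str, key: int = 0x5A) -> str:
    """Inverse of decode_cf_email for ASCII input, used here for demonstration."""
    return f"{key:02x}" + "".join(f"{ord(char) ^ key:02x}" for char in address)


encoded = encode_cf_email("user@example.com")
print(decode_cf_email(encoded))
```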

cisticola.scraper.bitchute.get_videos_user(session, user, csrftoken, detail)

Scrape videos for given BitChute user

Parameters:
  • session – HTTP Session to use

  • user (str) – Username to scrape videos for

  • csrftoken (str) – CSRF token to use for requests

  • detail (str) – Detail level to scrape: "basic", "detail", or "comments"

Returns:

Video data dictionaries, as a generator

cisticola.scraper.bitchute.request_from_bitchute(session, method, url, headers=None, data=None)

Send a request to BitChute, either to an API endpoint or to a regular page

Centralizes error checking and retrying on failure so that callers do not have to repeat it.

Parameters:
  • session – Requests session

  • method (str) – GET or POST

  • url (str) – URL to fetch

  • headers (dict) – Headers to pass with the request

  • data (dict) – Data/params to send with the request

Returns:

Requests response
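
The retry behaviour described above might look roughly like the following sketch. It is not the module's implementation: the function name request_with_retries, the retries and delay parameters, and the rule of retrying on 5xx responses are all assumptions. The session only needs a request(method, url, headers=..., data=...) method, which requests.Session provides.

```python
import time


def request_with_retries(session, method, url, headers=None, data=None,
                         retries=3, delay=1.0):
    """Call session.request, retrying on network errors and 5xx responses.

    `retries` and `delay` are hypothetical parameters for this sketch.
    """
    last_error = None
    for attempt in range(retries):
        try:
            response = session.request(method, url, headers=headers, data=data)
            if response.status_code < 500:
                return response
            last_error = RuntimeError(f"server returned {response.status_code}")
        except OSError as exc:
            # The exceptions raised by the requests library derive from
            # OSError, as do the built-in ConnectionError and TimeoutError.
            last_error = exc
        if attempt < retries - 1:
            time.sleep(delay)  # fixed pause between attempts
    raise RuntimeError(f"{method} {url} failed after {retries} attempts") from last_error
```

Keeping the loop in one helper means callers handle a single exception type once every attempt has failed.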

cisticola.scraper.bitchute.strip_tags(html, convert_newlines=True)

Strip HTML from a string

Parameters:
  • html – HTML to strip

  • convert_newlines – Whether to convert <br> and </p> tags to \n before stripping

Returns:

Stripped HTML
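
A regex-based sketch of the behaviour described above follows. The name strip_html is hypothetical, and unescaping entities and trimming surrounding whitespace are assumptions that the actual strip_tags may not share.

```python
import re
from html import unescape


def strip_html(html: str, convert_newlines: bool = True) -> str:
    """Remove HTML tags, optionally turning <br> and </p> into newlines first."""
    if convert_newlines:
        html = re.sub(r"<br\s*/?>|</p\s*>", "\n", html, flags=re.IGNORECASE)
    text = re.sub(r"<[^>]+>", "", html)  # drop any remaining tags
    return unescape(text).strip()        # assumption: decode entities and trim


print(strip_html("<p>Hello<br>world</p>"))
```

A regular expression is enough for simple markup like video descriptions. Malformed or adversarial HTML would be better served by a real parser such as html.parser from the standard library.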