cisticola.scraper.telegram_telethon module

class cisticola.scraper.telegram_telethon.TelegramTelethonScraper(telethon_session_name=None)

Bases: Scraper

An implementation of a Scraper for Telegram, using the Telethon library.

archive_files(result: ScraperResult) → ScraperResult

Archive the files referenced by the keys of the archived_url dict, skipping any files that have already been archived.

Parameters:

result (ScraperResult) – Previously scraped ScraperResult.

Returns:

The same ScraperResult as result, with every URL in the archived_url dict archived.

Return type:

ScraperResult
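
The "archive only what is missing" behaviour described above can be sketched roughly as follows. This is an illustrative sketch, not the library's implementation: FakeResult, its archived_urls field, archive_missing and the upload callback are hypothetical stand-ins, and it assumes the dict maps each original URL to its archived location, or to None when the file has not been archived yet.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for ScraperResult (not the real class)."""
    # Assumption: original URL -> archived location, or None if not archived.
    archived_urls: Dict[str, Optional[str]] = field(default_factory=dict)


def archive_missing(result: FakeResult, upload: Callable[[str], str]) -> FakeResult:
    """Archive every URL that has no archived location yet, then return result."""
    for url, archived in result.archived_urls.items():
        if archived is None:
            # Only values are reassigned, so iterating the dict stays safe.
            result.archived_urls[url] = upload(url)
    return result
```

Passing the upload step as a callback keeps the sketch independent of any particular storage backend; URLs that already have an archived location are left untouched.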

archive_post_media(post: Message)

can_handle(channel)

Return whether the scraper can scrape the specified channel.

Parameters:

channel (Channel) – Channel to be scraped.

Returns:

True if the scraper is capable of scraping channel, False if not.

Return type:

bool
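
One plausible way to make such a decision is to inspect the host of the channel's URL. The sketch below is an assumption for illustration only: FakeChannel, its url attribute, looks_like_telegram and the list of accepted hosts are hypothetical and are not taken from the module itself.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class FakeChannel:
    """Hypothetical stand-in for Channel; only a URL is modelled here."""
    url: str


# Assumption: hosts treated as Telegram for the purpose of this sketch.
TELEGRAM_HOSTS = {"t.me", "telegram.me"}


def looks_like_telegram(channel: FakeChannel) -> bool:
    """Return True if the channel URL points at a Telegram host."""
    host = urlparse(channel.url).netloc.lower()
    return host in TELEGRAM_HOSTS
```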

client = None

get_channel_identifier()

get_posts(channel: Channel, since: ScraperResult | None = None, until: ScraperResult | None = None) → Generator[ScraperResult, None, None]

Scrape all posts from the specified Channel.

Parameters:
  • channel (Channel) – Channel to be scraped.

  • since (ScraperResult or None) – Most recently scraped ScraperResult from a previous scrape, or None if the scraper has not run before.

Yields:

ScraperResult – Scraper result from a single post/comment from the specified Channel.
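
The incremental pattern implied by the since parameter can be sketched as a generator that skips everything at or before the previous scrape's newest result. This is a hedged illustration: FakePost, its date field and posts_since are hypothetical, and comparing by date is an assumption about how "most recently scraped" is determined. The until parameter is left out because its semantics are not documented here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass
class FakePost:
    """Hypothetical stand-in for ScraperResult; only an id and a date."""
    platform_id: str
    date: datetime


def posts_since(posts: Iterable[FakePost],
                since: Optional[FakePost] = None) -> Iterator[FakePost]:
    """Yield posts one at a time, skipping those not newer than `since`."""
    for post in posts:
        if since is not None and post.date <= since.date:
            continue  # already covered by a previous scrape
        yield post
```

Because the function is a generator, a caller can store each result as it arrives instead of waiting for the whole channel to be scraped.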

get_profile(channel: Channel) → RawChannelInfo

get_username_from_url()

Extract a channel’s username from its URL.

Parameters:

url (str) – URL of the channel on a given platform, e.g. "https://twitter.com/EliotHiggins".

Returns:

username – Extracted username of the channel, e.g. "EliotHiggins".

Return type:

str
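
A minimal sketch of this kind of extraction, using only the standard library, is shown below. It assumes the username is the first non-empty segment of the URL path, which holds for the example above but may not match every URL shape the real method handles; username_from_url is a hypothetical helper, not the method itself.

```python
from urllib.parse import urlparse


def username_from_url(url: str) -> str:
    """Return the first non-empty path segment of the URL as the username."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if not segments:
        raise ValueError(f"no username found in URL: {url!r}")
    return segments[0]


print(username_from_url("https://twitter.com/EliotHiggins"))  # EliotHiggins
```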