web_programming.emails_from_url
===============================

.. py:module:: web_programming.emails_from_url

.. autoapi-nested-parse::

   Get the site emails from URL.


Attributes
----------

.. autoapisummary::

   web_programming.emails_from_url.__author__
   web_programming.emails_from_url.__email__
   web_programming.emails_from_url.__license__
   web_programming.emails_from_url.__maintainer__
   web_programming.emails_from_url.__status__
   web_programming.emails_from_url.__version__
   web_programming.emails_from_url.emails


Classes
-------

.. autoapisummary::

   web_programming.emails_from_url.Parser


Functions
---------

.. autoapisummary::

   web_programming.emails_from_url.emails_from_url
   web_programming.emails_from_url.get_domain_name
   web_programming.emails_from_url.get_sub_domain_name


Module Contents
---------------

.. py:class:: Parser(domain: str)

   Bases: :py:obj:`html.parser.HTMLParser`

   Find tags and other markup and call handler functions.

   Usage::

       p = HTMLParser()
       p.feed(data)
       ...
       p.close()

   Start tags are handled by calling self.handle_starttag() or
   self.handle_startendtag(); end tags by self.handle_endtag().  The
   data between tags is passed from the parser to the derived class
   by calling self.handle_data() with the data as argument (the data
   may be split up in arbitrary chunks).  If convert_charrefs is
   True the character references are converted automatically to the
   corresponding Unicode character (and self.handle_data() is no
   longer split in chunks), otherwise they are passed by calling
   self.handle_entityref() or self.handle_charref() with the string
   containing respectively the named or numeric reference as the
   argument.


   .. py:method:: handle_starttag(tag: str, attrs: list[tuple[str, str | None]]) -> None

      This function parses HTML and takes URLs from tags.


   .. py:attribute:: domain


   .. py:attribute:: urls
      :type:  list[str]
      :value: []


.. py:function:: emails_from_url(url: str = 'https://github.com') -> list[str]

   This function takes a URL and returns all valid URLs.


.. py:function:: get_domain_name(url: str) -> str

   This function gets the main domain name.

   >>> get_domain_name("https://a.b.c.d/e/f?g=h,i=j#k")
   'c.d'
   >>> get_domain_name("Not a URL!")
   ''


.. py:function:: get_sub_domain_name(url: str) -> str

   >>> get_sub_domain_name("https://a.b.c.d/e/f?g=h,i=j#k")
   'a.b.c.d'
   >>> get_sub_domain_name("Not a URL!")
   ''


.. py:data:: __author__
   :value: 'Muhammad Umer Farooq'


.. py:data:: __email__
   :value: 'contact@muhammadumerfarooq.me'


.. py:data:: __license__
   :value: 'MIT'


.. py:data:: __maintainer__
   :value: 'Muhammad Umer Farooq'


.. py:data:: __status__
   :value: 'Alpha'


.. py:data:: __version__
   :value: '1.0.0'


.. py:data:: emails
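
The signatures and doctests above suggest a three-step flow: split a URL into its host parts, collect link targets from start tags, and pick out addresses on the site's domain. The following is a minimal, network-free sketch of that flow, not the module's actual source. The names ``sub_domain_of``, ``domain_of``, ``LinkCollector`` and ``find_emails`` are hypothetical, and both the choice of ``urllib.parse`` and the email pattern are assumptions made for illustration.

.. code-block:: python

   import re
   from html.parser import HTMLParser
   from urllib.parse import urljoin, urlparse


   def sub_domain_of(url: str) -> str:
       """Return the network location of ``url`` ('' when it has none)."""
       return urlparse(url).netloc


   def domain_of(url: str) -> str:
       """Return the last two dot-separated labels of the network location."""
       return ".".join(sub_domain_of(url).split(".")[-2:])


   class LinkCollector(HTMLParser):
       """Hypothetical stand-in for ``Parser``: gather ``<a href>`` targets."""

       def __init__(self, base_url: str) -> None:
           super().__init__()
           self.base_url = base_url
           self.urls: list[str] = []

       def handle_starttag(self, tag, attrs):
           if tag != "a":
               return
           for name, value in attrs:
               # Skip missing, empty, and bare-fragment links.
               if name == "href" and value and value != "#":
                   absolute = urljoin(self.base_url, value)
                   if absolute not in self.urls:
                       self.urls.append(absolute)


   def find_emails(text: str, domain: str) -> list[str]:
       """Return the sorted, de-duplicated addresses on ``domain`` in ``text``."""
       # Assumed pattern: a simple local part followed by '@' and the domain.
       pattern = r"[A-Za-z0-9._+-]+@" + re.escape(domain)
       return sorted(set(re.findall(pattern, text)))


   collector = LinkCollector("https://www.example.com")
   collector.feed('<a href="/about">About</a> <a href="#">Top</a>')
   collector.close()
   print(collector.urls)
   print(find_emails("Write to info@example.com", domain_of("https://www.example.com")))

Because the sketch feeds a literal HTML string to the parser, it runs without network access. A real crawl would first fetch each page body (for example with an HTTP client) and then pass the text to ``feed()`` and ``find_emails()``.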