Python Web Tools

A collection of tools I commonly use in my web development work.


From PyPI (not available yet)

Straight from GitHub

pip install


LoremPysum - Generate random texts

Credits to Luca De Vitis for the inspiration and starter code

Import the class

from py_web_tools import LoremPysum

Create a single LoremPysum instance with default Lorem Ipsum text

p = LoremPysum()

It is also possible to supply the list of words (in a text file) to be used. This is achieved by using the sample parameter during object creation

p = LoremPysum(sample="some_file.txt")

The following instance methods are defined. Methods that return an email address and a name in the form "firstname I. lastname" are also available.
p.sentence() # generate a single sentence.
p.paragraphs() # return a single paragraph of standard Lorem Ipsum text.
p.paragraphs(count=3) # return 3 paragraphs where the first paragraph is the standard text.
p.paragraphs(common=False) # return a single paragraph of random text instead of the standard text.
p.title() # generate a string (title case) with 2 to 6 words. Good for article titles.

In case you want to look into the words used, the following instance attributes are defined.

p.common # A list of the first few words in the lorem ipsum text
p.words # A list of all the words in the lorem ipsum text.
p.standard # Standard lorem ipsum text. Usually the first third of a sample file.



Lorem Pysum: Name, email, title, sentence and paragraph generator

class py_web_tools.lorem_pysum.LoremPysum(sample=None)[source]

Generate random sentences and paragraphs

Parameters:sample (file, optional) – a file containing the text to be used as sample. Default is Lorem Ipsum text.



Return an email address


Return any name with a middle initial.

paragraphs(count=1, common=True)[source]

Return paragraphs

  • count (int) – The number of paragraphs required. Default is 1
  • common (bool) – Whether the first paragraph will be the standard lorem ipsum text. Default is True
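
The interaction between count and common can be sketched as follows. This is a minimal illustration, not the library's code: make_paragraphs, random_paragraph, the shortened STANDARD text, the list return type and the sentence and word ranges are all assumptions made for the example.

```python
import random

# Shortened stand-in for the standard Lorem Ipsum paragraph (assumption).
STANDARD = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur"]


def random_paragraph(words, rng):
    """Join a few random sentences into one paragraph (hypothetical helper)."""
    sentences = []
    for _ in range(rng.randint(2, 4)):
        text = " ".join(rng.choice(words) for _ in range(rng.randint(4, 9)))
        sentences.append(text.capitalize() + ".")
    return " ".join(sentences)


def make_paragraphs(count=1, common=True, words=WORDS, rng=random):
    """Return a list of paragraphs; the first is STANDARD when common is True."""
    paragraphs = []
    for i in range(count):
        if common and i == 0:
            paragraphs.append(STANDARD)
        else:
            paragraphs.append(random_paragraph(words, rng))
    return paragraphs
```

With count=3 only the first paragraph is the standard text; with common=False every paragraph, including the first, is random.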

Return a sentence


The first word is capitalized, and the sentence ends in either a period or question mark. Commas are added at random.

Determine the number of comma-separated sections and number of words in each section for this sentence.
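
The sentence rules above can be sketched roughly like this. The helper name make_sentence and the ranges (1 to 4 sections, 3 to 8 words per section) are assumptions chosen for illustration; the library's actual limits are not stated here.

```python
import random


def make_sentence(words, rng=random):
    """Build one random sentence from a list of words (hypothetical helper)."""
    # Decide how many comma-separated sections the sentence has,
    # then pick a random number of words for each section.
    section_count = rng.randint(1, 4)
    sections = [
        " ".join(rng.choice(words) for _ in range(rng.randint(3, 8)))
        for _ in range(section_count)
    ]
    body = ", ".join(sections)
    # Capitalize the first word and end with a period or a question mark.
    return body[0].upper() + body[1:] + rng.choice("?.")
```

Passing a seeded random.Random instance as rng makes the output repeatable, which is handy in tests.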


Return a title consisting of 2 to 6 words
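
A title generator of this kind might look like the sketch below. make_title is a hypothetical name and the word list is supplied by the caller; only the 2 to 6 word range and the title case come from the description above.

```python
import random


def make_title(words, rng=random):
    """Return 2 to 6 random words in title case (hypothetical helper)."""
    count = rng.randint(2, 6)  # randint includes both endpoints
    return " ".join(rng.choice(words) for _ in range(count)).title()
```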


Create a BeautifulSoup object of a webpage

class py_web_tools.page_ripper.PageRipper(url='')[source]

Harvest words and links from a webpage

Parameters:url (str) – Page url. Default is ''



  1. PageRipper('').soup
  2. PageRipper('').page_soup(to_file='no')
  3. PageRipper('').raw_links()
  4. PageRipper('').links()
  5. PageRipper('').words()


Return all crawlable links (clickable url) on webpage

Yields:str – Clickable url


Links with “#” are excluded
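
The link filtering described above can be approximated with the standard library alone. The sketch below uses html.parser instead of BeautifulSoup so that it runs without extra packages; crawlable_links and _LinkCollector are hypothetical names, and treating "clickable" as "resolved to an absolute url" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawlable_links(html, base_url):
    """Yield absolute urls found in html, skipping any link that contains '#'."""
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        if "#" in href:
            continue  # fragment links are excluded
        yield urljoin(base_url, href)
```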


Harvest all words enclosed in <p> tags in webpage source

Yields:str – Single word which is not in list of excluded words
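
A rough standard-library sketch of harvesting words from <p> tags follows. It is not the PageRipper implementation: paragraph_words, _ParagraphText and the EXCLUDED set are hypothetical, and lowercasing each word is an assumption.

```python
import re
from html.parser import HTMLParser

# Hypothetical exclusion list; the library's own list is not shown here.
EXCLUDED = {"the", "a", "an", "and", "of"}


class _ParagraphText(HTMLParser):
    """Collect text that appears inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def paragraph_words(html, excluded=EXCLUDED):
    """Yield lowercase words found inside <p> tags, skipping excluded words."""
    parser = _ParagraphText()
    parser.feed(html)
    parser.close()
    for chunk in parser.chunks:
        for word in re.findall(r"[A-Za-z']+", chunk):
            if word.lower() not in excluded:
                yield word.lower()
```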


Indexing functions

py_web_tools.indexing.add_page_to_index(word_index, page_url)[source]

Add all words found in a webpage to the word index

  • word_index (dict) – Index of words
  • page_url (str) – url from which words are to be extracted

Returns: Word index with entries added/updated

Return type: dict



Modifies the input dictionary in place

py_web_tools.indexing.add_to_index(word_index, word, page_url)[source]

Add a word to the word index and add the page url to the list of urls associated with that word

  • word_index (dict) – Index of words
  • word (str) – Word to be added to the index
  • page_url (str) – url to be added in the list of urls associated with “word”

Returns: Word index with "word" and "page_url" added/updated.

Return type: dict



This function modifies the input dictionary in-situ (in place)
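
The in-place behaviour of the index functions can be illustrated with a small sketch. The add_to_index signature matches the one documented above, but the body is an assumption, as is skipping a url that is already listed. add_words_to_index is a hypothetical stand-in for add_page_to_index that takes the words directly instead of fetching a page.

```python
def add_to_index(word_index, word, page_url):
    """Record page_url under word, modifying word_index in place."""
    urls = word_index.setdefault(word, [])
    if page_url not in urls:  # assumption: each url is listed once per word
        urls.append(page_url)
    return word_index


def add_words_to_index(word_index, words, page_url):
    """Add every word in words to the index for page_url (hypothetical helper)."""
    for word in words:
        add_to_index(word_index, word, page_url)
    return word_index
```

Because the dictionary is modified in place, the returned object is the same one that was passed in, so the return value can be ignored if preferred.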
